Bash substitute gives unexpected results (?)

Johnny Rosenberg gurus.knugum at gmail.com
Sun Apr 5 20:50:17 UTC 2015


2015-04-05 20:50 GMT+02:00 Normand Marion <normand.marion at gmail.com>:

> ^ Locates regular expression that follows at the beginning of line.
>

Not inside [ ]. There, a leading ^ means ”NOT”: replace everything that is
NOT ”(” or ”–” with nothing. (The ”|” is taken literally inside brackets,
not as ”OR”, so it would be kept as well.)
The idea is that ONLY ”(” and ”–” should be returned so that I can count
them later. In this case there is no ”–”, so the expected output is ”(”,
but I also get all those Japanese characters returned, which is not what I
want. It works with sed, though.
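
For the counting goal described above, one way to sidestep bracket
expressions entirely (and with them any locale-dependent matching) is to
delete each literal character and compare string lengths. This is only a
minimal sketch and not something proposed in the thread; count_char is a
hypothetical helper name, and the sample string is the one from the
original question.

```shell
#!/usr/bin/env bash
# count_char STRING NEEDLE  (hypothetical helper, for illustration only)
# Prints how many times NEEDLE occurs in STRING. No bracket expression is
# used, so character-class matching rules never come into play.
count_char() {
    local s=$1 needle=$2
    # An empty needle would cause a division by zero below.
    [[ -n $needle ]] || return 1
    # Quoting the needle makes bash treat it as literal text, not a pattern.
    local stripped=${s//"$needle"/}
    # Dividing by the needle's length keeps the result correct whether the
    # shell measures length in characters (UTF-8 locale) or bytes (C locale).
    echo $(( (${#s} - ${#stripped}) / ${#needle} ))
}

a="Black Sand Beach (ブラック・サンド・ビーチ)"
count_char "$a" "("    # prints 1
count_char "$a" "–"    # prints 0
```

Because the needle is matched as literal text, the Japanese characters are
never compared against a character class at all.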


Kind regards

Johnny Rosenberg
ジョニー・ローゼンバーグ



> The ^ is only special when it occurs at the beginning of the regular
> expression.
>
> So in the sed command ^ followed by your range means that none of the
> [range of characters] occurred at the beginning of the line. ^ and /g are
> mutually exclusive.
>
> As I understand it, your expression is meant to remove '(|–', so you
> should write
>
> a="Black Sand Beach (ブラック・サンド・ビーチ)"; echo "$a" | sed 's/[(|–]//g'
>
> 2015-04-05 5:16 GMT-04:00 Nils Kassube <kassube at gmx.net>:
>
> Johnny Rosenberg wrote:
>> > This is what I want. Everything except ”(” and ”–” (n-dash) should be
>> > removed.
>> > However, using pure Bash doesn't seem to work:
>> >
>> > $ a="Black Sand Beach (ブラック・サンド・ビーチ)"; echo ${a//[^(|–]}
>> > (ブラック・サンド・ビーチ
>> > $
>> > Why does Bash consider n-dash and some Japanese characters the same?
>>
>> Maybe the three-byte encodings of your n-dash and the Japanese
>> characters look similar enough to bash?
>>
>> > Is there a setting I need to do somehow?
>>
>> It has something to do with your locale settings. If you try the same
>> with the default locale, the result is different. So I think you should
>> set one of the LC_* shell variables (probably LC_COLLATE) to Japanese
>> and try your command again.
>>
>>
>> Nils
>>
>>
>> --
>> ubuntu-users mailing list
>> ubuntu-users at lists.ubuntu.com
>> Modify settings or unsubscribe at:
>> https://lists.ubuntu.com/mailman/listinfo/ubuntu-users
>>
>
>
>
> --
> *Normand Marion*
>
> normand.marion at gmail.com
>