<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">2015-04-05 20:50 GMT+02:00 Normand Marion <span dir="ltr"><<a href="mailto:normand.marion@gmail.com" target="_blank">normand.marion@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div><div>^ Locates regular expression that follows at the beginning of line.</div></div></div></blockquote><div><br></div><div>Not inside [ ]. It means ”NOT”. Replace everything that is NOT ”(” OR ”–” with nothing.</div><div>The idea is that ONLY ”(” and ”–” should be returned so that I can count them later… In this case there is no ”–”, so the expected output is ”(”, but I also get all those Japanese characters returned, which is not what I want. It works with sed, though.</div><div><br></div><div><br></div><div><div>Kind regards</div><div><br></div><div>Johnny Rosenberg</div><div>ジョニー・ローゼンバーグ</div></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div><div> The ^ is only special when it occurs at the beginning of the regular expression.<br><br></div>So in the sed command ^ followed by your range means that any of the [range of characters] at the beginning of line never happened. 
^ and /g are mutually exclusive.<br><br></div>As I understand it, your expression is meant to remove '(|–', so you should write:<br><br>a="Black Sand Beach (ブラック・サンド・ビーチ)"; echo "$a" | sed 's/[(|–]//g'<br></div><div class="gmail_extra"><br><div class="gmail_quote">2015-04-05 5:16 GMT-04:00 Nils Kassube <span dir="ltr"><<a href="mailto:kassube@gmx.net" target="_blank">kassube@gmx.net</a>></span>:<div><div class="h5"><br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span>Johnny Rosenberg wrote:<br> > This is what I want. Everything except ”(” and ”–” (n-dash) should be<br> > removed.<br> > However, using pure Bash doesn't seem to work:<br> ><br> > $ a="Black Sand Beach (ブラック・サンド・ビーチ)"; echo ${a//[^(|–]}<br> > (ブラック・サンド・ビーチ<br> > $<br> > Why does Bash consider n-dash and some Japanese characters the same?<br> <br> </span>Maybe the three-byte encodings of your n-dash and the Japanese<br> characters look similar enough to bash?<br> <span><br> > Is there a setting I need to change somehow?<br> <br> </span>It has something to do with your locale settings. If you try the same<br> with the default locale, the result is different. 
So I think you should<br> set one of the LC_* shell variables (probably LC_COLLATE) to Japanese<br> and try your command again.<br> <span><font color="#888888"><br> <br> Nils<br> <br> <br> --<br> ubuntu-users mailing list<br> <a href="mailto:ubuntu-users@lists.ubuntu.com" target="_blank">ubuntu-users@lists.ubuntu.com</a><br> Modify settings or unsubscribe at: <a href="https://lists.ubuntu.com/mailman/listinfo/ubuntu-users" target="_blank">https://lists.ubuntu.com/mailman/listinfo/ubuntu-users</a><br> </font></span></blockquote></div></div></div><span class=""><font color="#888888"><br><br clear="all"><br>-- <br><div><font size="4"><b>Normand Marion</b></font><br><br><span style="font-family:'comic sans ms',sans-serif"><a href="mailto:normand.marion@gmail.com" target="_blank">normand.marion@gmail.com</a></span></div> </font></span></div></blockquote></div><br></div></div>
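<div>Since the stated goal in the thread is to count the ”(” and ”–” characters, one way to sidestep the bracket-expression question is to avoid bracket expressions altogether. The following is only a minimal sketch: count_char is a hypothetical helper name, not something from the thread, and it assumes Bash rather than plain sh. It removes each character as a quoted (literal) string and derives the count from the change in length, so it should not depend on how the current locale collates the characters.</div>

```shell
#!/usr/bin/env bash
# count_char STRING NEEDLE (hypothetical helper, not from the thread):
# print how many times the non-empty NEEDLE occurs in STRING.
count_char() {
    local s=$1 needle=$2
    # Quoting "$needle" makes Bash treat it as a literal string,
    # so no bracket expression (and no collation lookup) is involved.
    local stripped=${s//"$needle"}
    # Dividing by the needle length gives the same count whether
    # ${#...} measures characters (UTF-8 locale) or bytes (C locale).
    echo $(( (${#s} - ${#stripped}) / ${#needle} ))
}

a="Black Sand Beach (ブラック・サンド・ビーチ)"
count_char "$a" "("   # prints 1
count_char "$a" "–"   # prints 0
```

<div>If the original ${a//[^(|–]} form is preferred, it may be worth trying it with LC_COLLATE=C set while keeping a UTF-8 LC_CTYPE; whether that changes the result depends on the Bash and C library versions, so treat it as something to test rather than a guaranteed fix.</div>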