[Bug 1248239] Re: sorting fails on Japanese Unicode characters

Fumihito YOSHIDA hito at kugutsu.org
Wed Nov 6 01:51:49 UTC 2013


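The 'sort' command orders lines by the collation rules of the current locale, so in Unicode (UTF-8) locales the result is often not the order a human would expect. See the coreutils FAQ: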
http://www.gnu.org/software/coreutils/faq/#Sort-does-not-sort-in-normal-order_0021

As a workaround, I suggest running sort with LC_ALL=C set in its
environment (or setting that in your aliases), so that lines are
compared by raw byte value.

This is poor behavior, but it cannot be changed for historical reasons.

$ ls -1 | sort 
①
③
②
②-test.txt
⑤-test.txt
④-test.txt
⑥-test.txt
①-test.txt
③-test.txt

$ ls -1| LC_ALL=C sort
①
①-test.txt
②
②-test.txt
③
③-test.txt
④-test.txt
⑤-test.txt
⑥-test.txt
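
The workaround above can be sketched as a small shell function. This is only an illustration: the name csort is made up for this example, and the sample file names are copied from the listing above. It assumes a POSIX shell and a UTF-8 terminal. Because UTF-8 keeps code point order under byte comparison, the circled digits come out in numeric order.

```shell
# csort is a hypothetical helper name, not something coreutils provides.
# It sets LC_ALL=C for this one sort invocation only, so lines are
# compared by byte value instead of by the locale's collation rules.
csort() {
    LC_ALL=C sort "$@"
}

# Sample names taken from the listing above, fed in a scrambled order.
sorted=$(printf '%s\n' '③' '①-test.txt' '②' '①' '②-test.txt' | csort)
printf '%s\n' "$sorted"
```

In an interactive shell, the alias form mentioned earlier would be something like alias sort='LC_ALL=C sort'. A separately named function leaves the locale-aware sort available for cases where it is wanted.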

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to coreutils in Ubuntu.
https://bugs.launchpad.net/bugs/1248239

Title:
  sorting fails on Japanese Unicode characters

Status in “bash” package in Ubuntu:
  New
Status in “coreutils” package in Ubuntu:
  New

Bug description:
  There seems to be some oddity in how bash interprets some Japanese
  Unicode characters and in how they are sorted.

  $ ls -1 /tmp/*.txt
  /tmp/⑥-test.txt
  /tmp/⑤-test.txt
  /tmp/④-test.txt
  /tmp/①-test.txt
  /tmp/③-test.txt
  /tmp/②-test.txt
  $ ls -1 /tmp/*.txt|sort
  /tmp/⑥-test.txt
  /tmp/⑤-test.txt
  /tmp/④-test.txt
  /tmp/①-test.txt
  /tmp/③-test.txt
  /tmp/②-test.txt

  This occurs while booted into an up-to-date Precise system.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/bash/+bug/1248239/+subscriptions


