[Bug 821951] Re: sort -u erase some utf8 characters
An Yang
821951 at bugs.launchpad.net
Sat Aug 6 15:36:55 UTC 2011
The reason is that eglibc/glibc only supports the CJK UNIFIED IDEOGRAPH range (<U4E00>-<U9FA5>) defined in ISO 10646:1993.
eglibc/glibc lacks support for the CJK UNIFIED IDEOGRAPH extensions A/B/C/D defined in ISO 10646:2011.
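A rough way to check whether a given system is affected is to compare locale-aware and byte-wise deduplication on two distinct ideographs from outside the supported range. This is only a sketch: it assumes GNU coreutils sort and a UTF-8 locale in the environment, the variable names are made up for illustration, and the two sample characters (U+3400 and U+3401 from Extension A) are my choice, not taken from the attachment.

```shell
# Two distinct Extension A ideographs (U+3400 and U+3401), written as
# octal UTF-8 bytes so the script does not depend on the terminal encoding.
sample=$(printf '\343\220\200\n\343\220\201\n')

# Byte-wise comparison: distinct byte sequences are never merged.
bytewise_count=$(printf '%s\n' "$sample" | LC_ALL=C sort -u | wc -l | tr -d ' ')

# Locale-aware comparison: the result depends on the installed locale data.
locale_count=$(printf '%s\n' "$sample" | sort -u | wc -l | tr -d ' ')

echo "LC_ALL=C sort -u kept $bytewise_count lines"
echo "sort -u in the current locale kept $locale_count lines"
```

If the second count is lower than the first, the locale's collation is treating the two characters as equal. In that case running sort -u with LC_ALL=C should keep every distinct line, at the cost of ordering by byte value rather than by locale rules.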
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to coreutils in Ubuntu.
https://bugs.launchpad.net/bugs/821951
Title:
sort -u erase some utf8 characters
Status in “eglibc” package in Ubuntu:
Confirmed
Bug description:
sort -u erases some UTF-8 characters.
See the attachment for detailed data.
sort -u x.sorted.utf8 > x.sorted.uniq.utf8
diff x.sorted.uniq.utf8 x.sorted.utf8 > x.diff
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/821951/+subscriptions