[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8
Astara
launchpad at tlinx.org
Sun May 27 05:02:27 UTC 2012
UTF-8 collation here sorts uppercase before lowercase.
from: http://unicode.org/reports/tr10/#Case_Comparisons
6.6 Case Comparisons
In some languages, it is common to sort lowercase before uppercase; in
other languages this is reversed. Often this is more dependent on the
individual concerned, and is not standard across a single language. It
is strongly recommended that implementations provide parameterization
that allows uppercase to be sorted before lowercase, and provides
information as to the standard (if any) for particular countries. This
can easily be done to the Default Unicode Collation Element Table before
tailoring by remapping the L3 weights (see Section 7, Weight
Derivation). It can be done after tailoring by finding the case pairs
and swapping the collation elements.
----
Anyone not following the above should probably not claim Unicode
compatibility.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to bash in Ubuntu.
https://bugs.launchpad.net/bugs/120687
Title:
Caseless collate sequence in en_GB.UTF8
Status in “bash” package in Ubuntu:
Confirmed
Bug description:
do this in a gash (temporary) directory:
touch A a B b
ls [A-Z]*
you get:-
A b B
What most people (especially unix users with >25 years experience)
expect is:-
A B
I found out about this by accident yesterday by doing "rm [A-Z]*" in a
directory expecting only files with an initial uppercase letter to be
removed. You can imagine my surprise when every file (except those
starting with 'a') was removed. Fortunately most of the files were
either redundant or backed up, but it still caused me a completely
unnecessary hour's work to restore the damage.
Obviously the collating sequence is aAbBcCdD... but that really does
*not* make it right. Other Linux distros do not have this problem, but
then they seem to set:
export LC_COLLATE=C
as standard, which is missing in a standard Ubuntu installation
(6.06 LTS -> 7.04)
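To see why a range like [A-Z] picks up lowercase names under that ordering, here is a rough sketch. It models the aAbBcC...zZ sequence as a plain string and cuts out everything from 'A' through 'Z'. The string and the variable names are illustrative assumptions, not output from any real locale.

```shell
# Hypothetical model of the interleaved collating sequence described above.
order="aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ"

# A range [A-Z] covers every character that collates from 'A' through 'Z'.
tail=${order#*A}         # drop everything up to and including the first 'A'
covered="A${tail%Z*}Z"   # drop the final 'Z' and anything after it, then re-add the endpoints

echo "order:   $order"
echo "covered: $covered"

# Check whether lowercase 'a' and 'b' fall inside the covered span.
case $covered in
    *a*) has_a=yes ;;
    *)   has_a=no ;;
esac
case $covered in
    *b*) has_b=yes ;;
    *)   has_b=no ;;
esac
echo "a in range: $has_a, b in range: $has_b"
```

In this model only 'a' falls outside the span, which lines up with "rm [A-Z]*" sparing just the files starting with 'a'.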
That *is* the workaround, but I would respectfully suggest that you
set it as standard before someone destroys something irreplaceable!
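A minimal sketch of the workaround, run in a throwaway directory, assuming a POSIX-style shell with mktemp available. LC_ALL is unset first because, when set, it overrides LC_COLLATE. The [[:upper:]] class is an addition not mentioned in the report; it is shown because it does not rely on collating order at all.

```shell
# Work in a scratch directory so nothing real is touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch A a B b

# Assumption: LC_ALL must not be set, or it would take precedence over LC_COLLATE.
unset LC_ALL
export LC_COLLATE=C

# With C collation the range is taken in plain byte order, 'A' through 'Z'.
range_matches=$(echo [A-Z]*)

# A character class matches uppercase letters regardless of collating order.
class_matches=$(echo [[:upper:]]*)

echo "range: $range_matches"
echo "class: $class_matches"

# Clean up the scratch directory.
cd / && rm -rf "$tmpdir"
```

Using echo instead of rm while experimenting shows what a pattern would match before anything is deleted.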
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/bash/+bug/120687/+subscriptions