[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

Astara launchpad at tlinx.org
Sun May 27 05:02:27 UTC 2012


UTF-8 collation has upper case sorted before lower case.
From: http://unicode.org/reports/tr10/#Case_Comparisons

6.6 Case Comparisons

In some languages, it is common to sort lowercase before uppercase; in
other languages this is reversed. Often this is more dependent on the
individual concerned, and is not standard across a single language. It
is strongly recommended that implementations provide parameterization
that allows uppercase to be sorted before lowercase, and provides
information as to the standard (if any) for particular countries. This
can easily be done to the Default Unicode Collation Element Table before
tailoring by remapping the L3 weights (see Section 7, Weight
Derivation). It can be done after tailoring by finding the case pairs
and swapping the collation elements.


----

Anyone not following the above should probably not claim Unicode
compatibility.

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to bash in Ubuntu.
https://bugs.launchpad.net/bugs/120687

Title:
  Caseless collate sequence in en_GB.UTF8

Status in “bash” package in Ubuntu:
  Confirmed

Bug description:
  do this in a gash (temporary) directory:

  touch A a B b
  ls [A-Z]*

  you get:-

  A  b  B

  What most people (especially unix users with >25 years experience)
  expect is:-

  A B

  I found out about this by accident yesterday by doing "rm [A-Z]*" in a
  directory expecting only files with an initial uppercase letter to be
  removed. You can imagine my surprise when every file (except those
  starting with 'a') was removed. Fortunately most of the files were
  either redundant or backed up, but it still caused me a completely
  unnecessary hour's work to repair the damage.

  Obviously the collating sequence is aAbBcCdD... but that really does
  *not* make it right. Other Linux distros do not have this problem, but
  then they seem to set:

  export LC_COLLATE=C

  as standard, which is missing from a standard Ubuntu installation
  (6.06 LTS -> 7.04)

  That *is* the workaround, but I would respectfully suggest that you
  set it as standard before someone destroys something irreplaceable!
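
The workaround described above can be tried safely in a scratch
directory. What follows is only a minimal sketch, assuming bash and a
mktemp that supports -d; the variable names range_match and class_match
are illustrative and not part of the original report. It also tries
[[:upper:]], a POSIX character class that does not depend on collation
order, as an alternative to the [A-Z] range.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so nothing real can be matched or removed.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch A a B b

# LC_ALL, if set, overrides LC_COLLATE, so clear it first (assumption:
# the calling environment may have LC_ALL set to a UTF-8 locale).
unset LC_ALL
export LC_COLLATE=C

# With C collation the range A-Z covers only the ASCII uppercase letters.
range_match=$(echo [A-Z]*)

# The POSIX class asks for uppercase letters directly, not a collation range.
class_match=$(echo [[:upper:]]*)

echo "range: $range_match"
echo "class: $class_match"

# Clean up the scratch directory.
cd / && rm -rf "$tmpdir"
```

With LC_COLLATE=C in effect, both lines should list only A and B.
Checking a pattern with echo before handing it to rm is a cheap
precaution regardless of locale.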

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/bash/+bug/120687/+subscriptions



