[Bug 1646260] Re: Locale names should always include the codeset component
Gunnar Hjalmarsson
1646260 at bugs.launchpad.net
Sat Jun 2 13:46:31 UTC 2018
@Łukasz: This issue keeps causing trouble for users.
https://askubuntu.com/q/1042915
It would be great if you could give it priority soon.
** Changed in: localechooser (Ubuntu)
Importance: Undecided => High
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to localechooser in Ubuntu.
https://bugs.launchpad.net/bugs/1646260
Title:
Locale names should always include the codeset component
Status in localechooser package in Ubuntu:
Confirmed
Status in ubiquity package in Ubuntu:
Confirmed
Bug description:
If you install Ubuntu in English with Tel Aviv as the timezone
location, the installer figures out that the applicable locale is
en_IL and adds the line
LANG="en_IL"
to /etc/default/locale.
en_IL is a perfectly fine locale name; actually it's *the* correct
name of the English/Israel locale for UTF-8 according to SUPPORTED.
However, Python does not agree. Python generally presupposes that
locale names include the codeset component, although it accepts
names without a codeset if they are listed in the hard-coded
dictionary locale_alias in /usr/lib/python3.5/locale.py. en_IL is a
relatively new locale and is not (yet) included in locale_alias:
gunnar@gunnar-ubuntu-current:~$ python3
Python 3.5.2+ (default, Sep 22 2016, 12:18:14)
[GCC 6.2.0 20160927] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_CTYPE, 'en_IL')
'en_IL'
>>> mylocale = locale.getlocale(locale.LC_CTYPE)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.5/locale.py", line 577, in getlocale
    return _parse_localename(localename)
  File "/usr/lib/python3.5/locale.py", line 486, in _parse_localename
    raise ValueError('unknown locale: %s' % localename)
ValueError: unknown locale: en_IL
>>> quit()
I got to know about this issue via <http://askubuntu.com/q/854950>.
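Until the installer writes locale names with a codeset, an application
hit by the traceback above could guard the call itself. The following
is only a minimal sketch of that idea, not something taken from any
package: parse_locale_name and safe_getlocale are hypothetical helper
names, and the fallback simply splits the raw string that
locale.setlocale(category, None) reports.

```python
import locale


def parse_locale_name(name):
    """Split 'lang_TERRITORY[.codeset][@modifier]' into (language, codeset).

    Hypothetical helper: the codeset is None when the name has none,
    and any @modifier is dropped.
    """
    base, _, _modifier = name.partition('@')
    language, _, codeset = base.partition('.')
    return language, (codeset or None)


def safe_getlocale(category=locale.LC_CTYPE):
    """Like locale.getlocale(), but without the 'unknown locale' ValueError."""
    try:
        return locale.getlocale(category)
    except ValueError:
        # Passing None only queries the current setting; nothing is changed.
        raw = locale.setlocale(category, None)
        return parse_locale_name(raw)
```

With LANG="en_IL" the fallback path would return ('en_IL', None)
instead of raising, so the caller still has to choose an encoding on
its own.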
Now, the problem is not limited to en_IL. New locales in glibc tend to
be UTF-8 only locales without the codeset included in their names in
SUPPORTED. glibc and Python will probably never be in sync.
One way to deal with this issue is to always add '.UTF-8' to such
locale names. For instance, 'en_IL.UTF-8' is understood by both glibc
and Python.
Probably this should be fixed in localechooser. Basically I'd like to
see a code snippet along these lines:
if [ "$LOCALE" = "${LOCALE%.*}" ]; then
    LOCALE=$(echo "$LOCALE" | sed -r 's/([^@]+)/\1.UTF-8/')
fi
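For anyone who wants to try the rule outside the installer, the same
logic can be mirrored in Python. This is a sketch under the assumption
that the shell snippet above is the intended behaviour; ensure_codeset
is a hypothetical name, not an existing function in localechooser or
Python.

```python
def ensure_codeset(name, codeset='UTF-8'):
    """Append '.<codeset>' to a locale name that lacks a codeset.

    Mirrors the shell snippet: a name that already contains a dot is
    returned unchanged, and an @modifier stays at the end.
    """
    if '.' in name:
        return name
    base, sep, modifier = name.partition('@')
    if not base:
        # The sed expression needs at least one non-'@' character to match.
        return name
    return base + '.' + codeset + sep + modifier


for example in ('en_IL', 'en_IL.UTF-8', 'sr_RS@latin'):
    print(example, '->', ensure_codeset(example))
```

Note that, like the shell version, this also turns 'C' into 'C.UTF-8',
which may or may not be wanted in the installer.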
I haven't prepared a patch, since I don't know where exactly it should
be inserted without breaking anything else. (Don't know how to test it
either.) Still hoping that somebody finds it important enough to fix.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/localechooser/+bug/1646260/+subscriptions