bzr-email fails if committer has non-ascii gecos
Glenn Morris
rgm at gnu.org
Wed Aug 28 20:57:30 UTC 2013
Vincent Ladeuil wrote:
>> Urgh, you mean python can't decode utf8 if locales is not installed ?
>>
>> Could it be that python chokes on importing your source rather than not
>> being able to decode utf8 from an external file ?
>>
>> Do you still encounter the issue with:
[...]
> string = 'Bela\xc3\xafche'
That was a good suggestion, but I'm afraid it made no difference.
Yes, it seems python does need locales to be installed, and furthermore
needs the LANG environment variable to be set to (e.g.) en_US.utf8 for
string.decode('utf-8') to work.
However, I still can't get bzr to work correctly.
I discovered that the gecos data actually seem to be in latin-1, not
utf-8. Trying to decode it in utf-8 fails with
UnicodeDecodeError: 'utf8' codec can't decode byte 0xef in position 12: invalid continuation byte
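
For illustration, here is a minimal sketch of why the utf-8 decode fails on latin-1 data. The byte strings are hypothetical samples modelled on the test string quoted above, not the actual gecos field (which is longer, hence position 12 in the real error), and try_decode is a helper name invented for this example. It is written so that it also runs on Python 3.

```python
# Hypothetical sample bytes: the same name encoded two different ways.
latin1_bytes = b'Bela\xefche'      # i-diaeresis as the single latin-1 byte 0xef
utf8_bytes = b'Bela\xc3\xafche'    # i-diaeresis as the two-byte utf-8 sequence


def try_decode(raw, encoding):
    """Return the decoded text, or None if raw is not valid in encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# 0xef starts a three-byte utf-8 sequence, but the next byte ('c') is not
# a continuation byte, so decoding the latin-1 data as utf-8 fails.
as_utf8 = try_decode(latin1_bytes, 'utf-8')

# latin-1 maps every byte to a code point, so this decode always succeeds.
as_latin1 = try_decode(latin1_bytes, 'latin-1')
```

Both encodings describe the same text, so decoding utf8_bytes as utf-8 yields the same string as decoding latin1_bytes as latin-1.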
So I set LANG=en_US.ISO-8859-1, but
from bzrlib import osutils
print osutils.get_user_encoding()
still returns 'ascii'. Looking at what get_user_encoding does, the
following returns "ANSI_X3.4-1968":
import locale
print locale.nl_langinfo(locale.CODESET)
as does this:
print locale.getpreferredencoding(False)
But locale.getpreferredencoding(True) returns the correct "ISO-8859-1".
So I suppose I have to add a call to
locale.setlocale(locale.LC_ALL, "")
at the start of get_user_encoding, or something like that?
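
A rough sketch of that idea, using only the standard locale module. The function name user_encoding_after_setlocale is made up for this example; this is not bzrlib's actual get_user_encoding, just the setlocale-then-query pattern described above, written so that it also runs on Python 3.

```python
import codecs
import locale


def user_encoding_after_setlocale():
    """Adopt the environment's locale, then report its codeset."""
    try:
        # Without this call the process stays in the "C" locale, whatever
        # LANG says, and the reported codeset is an ASCII one.
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # LANG/LC_* name a locale that is not installed; stay in "C".
        pass
    # False: do not let getpreferredencoding() call setlocale itself.
    return locale.getpreferredencoding(False)


encoding = user_encoding_after_setlocale()
# Normalise aliases such as "ANSI_X3.4-1968" to Python's own codec name.
codec_name = codecs.lookup(encoding).name
```

Note that setlocale changes process-wide state, so whether it belongs inside get_user_encoding or once at program start-up is a design question this sketch does not settle.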