[BUG] Unicode string must be always used with encodings
Alexander Belchenko
bialix at ukr.net
Sun Sep 25 15:06:48 BST 2005
There are a number of errors with unicode strings in bzr that exist because
the default 'ascii' encoding is used for decode and encode. This is bad,
because bzr fails when it works with Russian (for example) filenames.
I propose always using an encoding specific to the current user.
For decoding a flat string into unicode we need to use user_encoding,
which is defined in bzrlib/__init__.py as:
import locale
user_encoding = locale.getpreferredencoding() or 'latin-1'
(I changed the default 'ascii' string to 'latin-1' because it will work for
roughly 80-90% of all users)
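A minimal sketch of the decoding side, assuming Python 3 (where a flat string is a bytes object). The helper name decode_with_user_encoding is hypothetical and is not part of bzrlib; the explicit 'cp1251' argument only makes the example reproducible on any machine.

```python
import locale


def decode_with_user_encoding(raw, encoding=None):
    """Hypothetical helper: decode bytes with the user's preferred encoding.

    Falls back to 'latin-1' when the locale reports no encoding.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding() or 'latin-1'
    return raw.decode(encoding)


# A Russian filename ("fail.txt" spelled in Cyrillic), written with escapes
# so the example does not depend on the source file encoding.
name = '\u0444\u0430\u0439\u043b.txt'
raw_name = name.encode('cp1251')

# Decoding with 'ascii' fails on the non-ASCII bytes.
try:
    raw_name.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Decoding with the encoding the bytes were written in round-trips cleanly.
decoded = decode_with_user_encoding(raw_name, 'cp1251')
```

In real use the encoding argument would be left out, so that the locale decides.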
For encoding unicode strings into flat strings we need to use this encoding:
import sys
stdout_encoding = sys.stdout.encoding or 'ascii'
if stdout_encoding == 'ascii':
    stdout_encoding = user_encoding
We must determine the output encoding this way because on my Russian
version of Windows the system encoding is 'cp1251' but the console
encoding is 'cp866'.
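To illustrate why the two encodings must be kept apart, here is a small sketch, again assuming Python 3. The function pick_stdout_encoding is a hypothetical name that only restates the selection rule above; it is not existing bzrlib code.

```python
def pick_stdout_encoding(stream_encoding, user_encoding):
    """Hypothetical restatement of the rule above.

    Use the stream's encoding unless it is missing or plain 'ascii',
    in which case fall back to the user's encoding.
    """
    encoding = stream_encoding or 'ascii'
    if encoding == 'ascii':
        encoding = user_encoding
    return encoding


# The same Cyrillic text yields different bytes in the two code pages.
name = '\u0444\u0430\u0439\u043b'
as_system = name.encode('cp1251')   # system (ANSI) code page
as_console = name.encode('cp866')   # console (OEM) code page

# Bytes written in cp1251 but interpreted by a cp866 console come out
# as different characters.
misread = as_system.decode('cp866')

# With a cp866 console the console encoding is chosen; with no usable
# stream encoding the user encoding is the fallback.
console_choice = pick_stdout_encoding('cp866', 'cp1251')
fallback_choice = pick_stdout_encoding(None, 'cp1251')
```

In real use the first argument would be sys.stdout.encoding and the second would be user_encoding.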
Alexander.