[RFC] decoding environment variables to unicode?

Alexander Belchenko bialix at ukr.net
Tue Sep 4 08:50:24 BST 2007

Martin Pool wrote:
> On 9/4/07, Alexander Belchenko <bialix at ukr.net> wrote:
>> My patch for #131100 reveals a hidden bug in show_version.
>> If I have the BZR_HOME env variable set to a non-ascii value,
>> then current bzr.dev shows it as is (and probably in the wrong encoding),
>> but with my patch it throws UnicodeDecodeError, because config.config_dir()
>> returns a non-ascii plain string. This is a very rare case, and I stepped
>> into it only because I tried to unicodize BZR_HOME in test cases.
>> In module win32utils I have a special function _ensure_unicode()
>> that decodes plain strings to unicode with the user encoding.
>> What is the best way to handle this case:
>> 1) convert the location of the config dir to unicode in the config_dir()
>> function? But how to handle non-windows platforms? Is it OK to use
>> _ensure_unicode for them?
>> 2) convert the location of the config dir to unicode only in the
>> show_version() function? Is it OK to use win32utils._ensure_unicode on
>> non-win32 platforms? (In this case I need to move this function to
>> osutils, IMO.)
>> 3) ignore this case until a bug report from real users is filed?
>> I think that variant 1 is wrong and vulnerable on non-win32 platforms.
>> And variant 3 is here only because I think it's a very, very rare case.
> In general we keep paths in memory as unicode, and #1 is consistent
> with that approach.  I guess this will be a problem if the user's
> $HOME or similar is in an encoding we can't understand, but that does
> work when passed to filesystem functions.  I think you should try to
> decode it using the user encoding.

Actually, paths constructed from env variables are an exception here:
no special conversion to unicode is performed. That's why I sent
this RFC.
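
To make the question concrete, here is a rough sketch of what decoding
with the user encoding could look like. It is not the actual
win32utils._ensure_unicode code: the name ensure_unicode, the encoding
parameter, and the fallback of returning the undecoded value are all
assumptions for illustration, and it is written for modern Python,
where a "plain string" from the environment corresponds to bytes.

```python
import locale


def ensure_unicode(value, encoding=None):
    """Hypothetical helper: decode a byte string using the user encoding.

    Text input is returned unchanged. If decoding fails, the original
    bytes are returned so that they can still be passed to filesystem
    functions (an assumed fallback, not confirmed bzr behaviour).
    """
    if isinstance(value, str):
        return value
    if encoding is None:
        # Assumption: the "user encoding" is the locale's preferred one.
        encoding = locale.getpreferredencoding(False) or "ascii"
    try:
        return value.decode(encoding)
    except UnicodeDecodeError:
        return value


# Example: a non-ascii config dir value arriving as raw bytes.
raw = "/home/пользователь/.bazaar".encode("utf-8")
config_dir = ensure_unicode(raw, "utf-8")
```

With a fallback like this, a value in an encoding we can't understand
stays as it was instead of raising UnicodeDecodeError, but callers such
as show_version() would then have to cope with both types.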

