[win32] non-ascii/non-english file names: internal usage of file names

Alexander Belchenko bialix at ukr.net
Tue Nov 29 23:20:51 GMT 2005


John Arbash Meinel wrote:
> The specific problem is that a StringIO has encoding of "ascii"
> (actually it has None, but that implies ascii). So there are a lot of
> things that won't encode to ascii.
> What we need to do is change the code that writes, so that it does its
> own encoding, and if the command isn't critical, it will use encode(txt,
> 'replace') which will put dummy characters if it can't encode something.
> Or if it is critical that the output is correct, then it will use
> encode(txt, 'error'), which will throw an exception if a character
> cannot be encoded properly.

But for the purposes of editing the commit message and showing the tree
status to the user, this approach is bad: neither 'replace' nor 'error'
is desirable.
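
To illustrate why neither handler helps here, a minimal sketch follows. The file name is made up for the example, and note that Python's codecs spell the raising handler 'strict' rather than 'error'.

```python
# Illustrative only: a Cyrillic file name that cannot be represented in ASCII.
name = u"\u0444\u0430\u0439\u043b.txt"

# 'replace' substitutes '?' for every character that cannot be encoded,
# so the user can no longer tell which file is meant.
lossy = name.encode("ascii", "replace")

# 'strict' (the raising handler) refuses to encode the name at all.
try:
    name.encode("ascii", "strict")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

With 'replace' the status output or commit template shows only question marks; with 'strict' the command aborts. Neither is useful when the user has to read or edit the text.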

The pure-Python implementation of StringIO (not cStringIO) accepts either
ASCII strings or unicode strings, so we can use the latter form.
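
A rough sketch of that idea: keep everything unicode inside the buffer and encode only once, at the point of output. It uses io.StringIO from later Python versions as a stand-in for the pure-Python StringIO; render_status and encode_for_output are hypothetical helpers, not bzrlib functions, and cp866 is just an example console code page.

```python
import io


def render_status(paths):
    # Hypothetical helper: collect status lines as unicode text only.
    buf = io.StringIO()
    for path in paths:
        buf.write(u"added %s\n" % path)
    return buf.getvalue()


def encode_for_output(text, encoding):
    # Encode once, at the boundary where the text leaves the program.
    return text.encode(encoding)


text = render_status([u"\u0444\u0430\u0439\u043b.txt"])
data = encode_for_output(text, "cp866")
```

Nothing is lost while the text sits in the buffer; the choice of target encoding is deferred until the destination is known.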

>>Furthermore, I recently sent a patch (#27) that fixes some encoding issues in
>>the commit and log commands. I also gave an example where the system encoding
>>and the console encoding may differ on a Windows machine (due to Windows
>>backward compatibility). That patch needs to be taken into
>>consideration when the above code chunk is refactored: show_status
>>should be encoded by default with bzrlib.user_encoding, not with
>>sys.stdout.encoding, I guess.
> 
> Well wouldn't the console encoding be the "correct" encoding for output
> (sys.stdout), since you are trying to display something. While if you
> are reading from a file, you might expect bzrlib.user_encoding.

Exactly. It's all in my patch.
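
As a rough illustration of that distinction (this is not the code from the patch), the two encodings could be looked up separately. The function names are hypothetical, and the sketch assumes the user encoding corresponds to the locale's preferred encoding.

```python
import locale
import sys


def console_encoding(stream=None):
    # Encoding for text shown to the user; falls back to the locale
    # encoding when the stream does not report one (e.g. a pipe).
    stream = sys.stdout if stream is None else stream
    return getattr(stream, "encoding", None) or locale.getpreferredencoding(False)


def user_encoding():
    # Encoding assumed for files the user creates and edits.
    return locale.getpreferredencoding(False) or "ascii"
```

On a Windows machine the two can legitimately return different values, which is why display output and file contents should not share one hard-coded encoding.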

> Now for the commit message, you probably want to put it out with the
> system encoding, because the user will edit it with a text editor, save
> it, and then we read it back in.

Sure.
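
A minimal sketch of that round trip, assuming the message is written to a temporary file and read back with the same encoding. The helper name and its edit callback (a stand-in for launching the user's editor) are hypothetical.

```python
import io
import locale
import os
import tempfile


def roundtrip_commit_message(template, encoding=None, edit=None):
    # Default to the system (locale) encoding, which is what a text
    # editor on the same machine is expected to use.
    encoding = encoding or locale.getpreferredencoding(False)
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with io.open(path, "w", encoding=encoding) as f:
            f.write(template)
        if edit is not None:
            edit(path)  # stand-in for running the user's editor
        with io.open(path, "r", encoding=encoding) as f:
            return f.read()
    finally:
        os.remove(path)
```

For example, roundtrip_commit_message(u"\u0444\u0430\u0439\u043b", encoding="cp1251") returns the same unicode text it was given, because writing and reading use one encoding.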

Alexander




