[win32] non-ascii/non-english file names: internal usage of file names

Alexander Belchenko bialix at ukr.net
Wed Nov 30 11:21:20 GMT 2005


Andrew Bennetts wrote:
> On Wed, Nov 30, 2005 at 01:20:51AM +0200, Alexander Belchenko wrote:
> [...]
>>>Or if it is critical that the output is correct, then it will use
>>>encode(txt, 'error'), which will throw an exception if a character
>>>cannot be encoded properly.
>>
>>But for the purpose of editing the commit message and showing the tree
>>status to the user, this approach is bad: neither 'replace' nor 'error'
>>is what we want.
> 
> If some text you want to display cannot be encoded in the console's encoding,
> you have no choice.  Probably an uncommon situation, but definitely possible.

I mean that the limitations of StringIO combined with the replace/error
encoding schemes are not a good fit when we want to show something to the
user in their text editor.
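
To illustrate the two behaviours being discussed, here is a minimal sketch
in modern Python 3 syntax. The file name is a made-up example, and I assume
that 'error' in the quoted text means the standard 'strict' handler:

```python
# Hypothetical non-ASCII file name, used only for illustration.
text = "файл.txt"

# 'replace' never fails, but each unencodable character becomes '?',
# so the user would see a mangled name in the editor.
lossy = text.encode("ascii", "replace")

# 'strict' (the default handler) raises instead of silently losing data.
try:
    text.encode("ascii", "strict")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

Neither result is useful for showing the tree status to the user: one loses
the name, the other aborts.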

>>The native Python implementation of StringIO (not cStringIO) accepts
>>ASCII strings or unicode strings. So, we can use the latter form.
> 
> I don't think it's wise to rely on differences between StringIO vs. cStringIO --
> they're probably accidental, and likely to change in future versions of Python.

I agree.

> Treat files, including [c]StringIO instances, as byte-streams, and explicitly
> encode unicode when writing to them.  The codecs.EncodedFile wrapper makes this
> easy.

It seems that you're right. The following code does what I need:

         # Fall back to the user's default encoding if none was given.
         input_encoding = input_encoding or bzrlib.user_encoding

         b, selected_list = branch_files(selected_list)
         if message is None and not file:
             import codecs
             # Collect the status output as encoded bytes.
             catcher = StringIO()
             # The writer encodes unicode to input_encoding on write.
             wrapper = codecs.getwriter(input_encoding)(catcher)
             show_status(b, specific_files=selected_list,
                         to_file=wrapper)
             message = edit_commit_message(catcher.getvalue())
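
The same pattern can be tried in isolation. This is only a sketch in modern
Python 3, where io.BytesIO plays the role of the byte-oriented StringIO;
capture_encoded and fake_show_status are hypothetical stand-ins and not
part of bzrlib:

```python
import codecs
import io


def capture_encoded(write_status, encoding):
    """Run write_status against an encoding writer and return the bytes."""
    catcher = io.BytesIO()                           # plain byte stream
    wrapper = codecs.getwriter(encoding)(catcher)    # encodes on write()
    write_status(wrapper)
    return catcher.getvalue()


def fake_show_status(to_file):
    # Hypothetical stand-in for show_status(): writes unicode text.
    to_file.write("modified:\n  файл.txt\n")


data = capture_encoded(fake_show_status, "utf-8")
```

The underlying stream only ever sees bytes, so nothing depends on the
differences between StringIO and cStringIO.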


I'll create a corresponding patch.

Alexander
