[win32] non-ascii/non-english file names: internal usage of file names

Jan Hudec bulb at ucw.cz
Wed Nov 30 08:07:53 GMT 2005


On Wed, Nov 30, 2005 at 10:46:18 +1100, Andrew Bennetts wrote:
> On Wed, Nov 30, 2005 at 01:20:51AM +0200, Alexander Belchenko wrote:
> [...]
> > >Or if it is critical that the output is correct, then it will use
> > >encode(txt, 'error'), which will throw an exception if a character
> > >cannot be encoded properly.
> > 
> > But for the purpose of editing the commit message and showing the tree
> > status to the user this approach is bad: neither 'replace' nor 'error'
> > gives the desired result.
> 
> If some text you want to display cannot be encoded in the console's encoding,
> you have no choice.  Probably an uncommon situation, but definitely possible.
> 
> > The native Python implementation of StringIO (not cStringIO) accepts
> > ASCII strings or unicode strings, so we can use the latter form.
> 
> I don't think it's wise to rely on differences between StringIO vs. cStringIO --
> they're probably accidental, and likely to change in future versions of Python.
> 
> Treat files, including [c]StringIO instances, as byte-streams, and explicitly
> encode unicode when writing to them.  The codecs.EncodedFile wrapper makes this
> easy.

Hm, that sounds like the right way to do all the IO (Tcl (for
ages) and Perl (since 5.8) have this capability built into every
stream, and it can be changed on the fly). Unfortunately the documentation
does not say what codecs.EncodedFile does on reading. IMHO the correct
behaviour would be to always return unicode, decoding as necessary.
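
To make the idea concrete, here is a minimal sketch of "byte streams plus
explicit encoding" using codecs.getwriter and codecs.getreader. It is written
for a current Python 3, where io.BytesIO stands in for [c]StringIO. The
helper names write_text and read_text are made up for illustration and are
not part of bzr. As far as I can tell from the current codecs documentation,
codecs.EncodedFile itself recodes between two byte encodings and hands back
byte strings on read, so a reader obtained from codecs.getreader is closer to
the "always return unicode" behaviour described above.

```python
import codecs
import io


def write_text(byte_stream, text, encoding="utf-8", errors="strict"):
    """Encode text explicitly while writing it to a byte stream.

    Hypothetical helper: 'strict' raises on unencodable characters,
    'replace' substitutes '?' for them.
    """
    writer = codecs.getwriter(encoding)(byte_stream, errors=errors)
    writer.write(text)


def read_text(byte_stream, encoding="utf-8"):
    """Read the rest of a byte stream and return it decoded to text."""
    reader = codecs.getreader(encoding)(byte_stream)
    return reader.read()


# Round trip of a non-ASCII file name through an in-memory byte stream.
buf = io.BytesIO()
write_text(buf, "имя файла.txt")
buf.seek(0)
print(read_text(buf))
```

With errors="strict" the write fails loudly when the target encoding cannot
represent a character, and with errors="replace" it degrades to '?', which is
the trade-off discussed in the quoted text.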

-- 
						 Jan 'Bulb' Hudec <bulb at ucw.cz>