[win32] non-ascii/non-english file names: internal usage of file names

John Arbash Meinel john at arbash-meinel.com
Tue Nov 29 14:23:20 GMT 2005


Alexander Belchenko wrote:
> As far as I can see, the cardinal difference between the Windows version
> of Python and Linux/Cygwin Python is this: when you use a plain string on
> Windows for the base part of a file name, all derived file names are
> always represented as plain strings. On Linux/Cygwin, as far as I can
> see, when a path cannot be represented as a plain string (or in the ascii
> encoding?) it is silently converted to unicode. As a result we get
> different behaviour with non-ascii (non-english) file names.
> 
> As a workaround for this incompatibility, bzrlib code should always use
> unicode file paths for all operations. The key points here are default
> directory values such as '.', used when constructing a Branch object, etc.
> 

I agree that it should use unicode filenames internally at all times.
Thanks for looking into this.
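
A minimal sketch of the "unicode internally" idea, written for present-day
Python 3 rather than the Python of this thread. to_unicode_path and
list_names are hypothetical helpers for illustration, not bzrlib functions.
It relies on the fact that os.listdir returns names of the same type as its
argument, so normalising the base path once keeps every derived name unicode:

```python
import os


def to_unicode_path(path):
    """Return path as a unicode str (hypothetical helper, not bzrlib API)."""
    if isinstance(path, bytes):
        # Decode byte paths using the filesystem encoding.
        return os.fsdecode(path)
    return path


def list_names(base="."):
    """List a directory, always yielding unicode names."""
    # os.listdir returns str names for a str argument and bytes names
    # for a bytes argument, so normalise the base path first.
    return os.listdir(to_unicode_path(base))
```

With this, a default such as '.' and a byte path supplied by the caller both
end up on the same unicode code path.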

As for the "StringIO" failure to decode into ASCII, that is something
we've discussed.
Basically, there are commands which "must be correct" and commands which
"shouldn't fail". 'bzr commit' shouldn't fail just because it can't display
the log correctly (hence it should use encode(foo, 'replace')). Other
commands must not succeed with bogus output (possibly bzr diff, anything
that writes into a control file, etc.), and those should use the
default encode(foo, 'strict'). We just need to make more explicit encoding
calls.
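
The two policies could be sketched roughly as follows. The function names
are made up for illustration and the ascii default is an assumption; only
the 'replace' and 'strict' error handlers are real Python behaviour:

```python
def encode_for_display(text, encoding="ascii"):
    """'Shouldn't fail' policy: unencodable characters become '?'."""
    return text.encode(encoding, "replace")


def encode_exact(text, encoding="ascii"):
    """'Must be correct' policy: raise UnicodeEncodeError rather than
    write bogus output ('strict' is Python's default error handler)."""
    return text.encode(encoding, "strict")
```

A command that only displays a log message would call the first one; anything
writing a control file would call the second and let the error propagate.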

John
=:->


