[win32] non-ascii/non-english file names: internal usage of file names
David Allouche
david at allouche.net
Wed Nov 30 13:33:38 GMT 2005
On Tue, 2005-11-29 at 08:23 -0600, John Arbash Meinel wrote:
> Alexander Belchenko wrote:
> > As far as I can see, the key difference between the Windows version of
> > Python and Linux/Cygwin Python is this: on Windows, when you use a plain
> > (byte) string for the base part of a file name, all derived file names
> > are always represented as plain strings. On Linux/Cygwin, as far as I can
> > see, when a path cannot be represented as a plain string (or in the ASCII
> > encoding?), it is silently converted to unicode. As a result, we get
> > different behaviour with non-ASCII (non-English) file names.
> >
> > To work around this incompatibility, bzrlib code should always use
> > unicode file paths for all operations. The key points here are default
> > directory values such as '.' used when constructing a Branch object, etc.
> >
>
> I agree that it should use unicode filenames internally at all times.
> Thanks for looking into this.
Except when it's not possible. I can trivially create a plausible
filename on Unix that cannot be decoded to unicode in any meaningful
way.

For example:

    (u'/Utilisateurs/Édouard/'.encode('latin-1') +
     u'docs/thèse.tex'.encode('utf-8'))
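
To make the failure concrete, here is a minimal sketch in present-day Python 3 syntax (the thread itself predates it). The helper name try_decode is hypothetical and exists only for this illustration. It shows that the mixed-encoding path is rejected as UTF-8, and that decoding it as Latin-1 succeeds only by turning the file name into mojibake.

```python
# Mixed-encoding path from the example above: the directory part is
# Latin-1 bytes, the file part is UTF-8 bytes.
path = ('/Utilisateurs/Édouard/'.encode('latin-1') +
        'docs/thèse.tex'.encode('utf-8'))


def try_decode(raw, encoding):
    """Hypothetical helper: return the decoded text, or None on failure."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# Latin-1 'É' is the single byte 0xC9, which UTF-8 reads as the start of
# a two-byte sequence; the following 'd' is not a continuation byte.
as_utf8 = try_decode(path, 'utf-8')

# Latin-1 maps every byte to a character, so this never fails, but the
# UTF-8 bytes of 'è' come out as two unrelated characters.
as_latin1 = try_decode(path, 'latin-1')
```

Neither result is a meaningful unicode name for the file: the first decode fails outright, and the second round-trips the bytes but does not give the text a user would recognise.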
Some systems treat file names as character strings (Windows?); others
treat them as byte streams. You probably cannot get correct and
reliable behaviour on both if you do not acknowledge the discrepancy.
It's probably a reasonable requirement that the relative names of
version controlled files should be stored (and treated internally) as
unicode, but I do not think it's reasonable to require that all path
handling be done on unicode strings.
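
One way to read this split, sketched under assumptions: the working-tree root stays as the raw bytes the OS reported, and only the version-controlled relative name is unicode, encoded at the moment a full path is needed. The function full_path and its name_encoding parameter are hypothetical and are not bzrlib API; posixpath is used so the result does not depend on the platform running the sketch.

```python
import posixpath


def full_path(base_dir: bytes, relname: str,
              name_encoding: str = 'utf-8') -> bytes:
    """Join a raw byte directory with a unicode relative name.

    Hypothetical helper: base_dir is passed through untouched, so it
    never has to be decodable; only relname is encoded.
    """
    return posixpath.join(base_dir, relname.encode(name_encoding))


# A tree root whose bytes are Latin-1 and therefore not valid UTF-8.
root = '/Utilisateurs/Édouard'.encode('latin-1')

# The relative name is ordinary unicode text.
target = full_path(root, 'docs/thèse.tex')
```

The resulting byte path is exactly the mixed-encoding name from the earlier example, built without ever decoding the root directory.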
--
-- ddaa