[win32] non-ascii/non-english file names: internal usage of file names

Jan Hudec bulb at ucw.cz
Wed Nov 30 19:04:38 GMT 2005


On Wed, Nov 30, 2005 at 10:23:23 -0500, Aaron Bentley wrote:
> David Allouche wrote:
> > Except when it's not possible. I can trivially create a plausible
> > filename in unix that cannot be decoded to unicode in any meaningful
> > way.
> > 
> > For example:
> > 
> > u'/Utilisateurs/Édouard/'.encode('latin-1') +
> > u'docs/thèse.tex'.encode('utf-8')
> > 
> > Some systems consider file names as character strings (Windows?) others
> > consider file names as byte stream. You probably cannot get correct and
> > reliable behaviour for both if you do not acknowledge the discrepancy.
> 
> We can require that all files in a version-controlled directory have
> unicode-meaningful names.  I think that there are very few situations
> involving source code where totally arbitrary filenames are an advantage.

Converting filenames from the local encoding to Unicode is not a problem (as
bzr can always refuse to work if it is not possible). But it IS a problem the
other way round. Say someone on an iso-8859-2 system creates a file named 'kř'
(k followed by r-caron, U+0159, for those who can't display that character),
and someone else on an iso-8859-1 system, which has no such character, tries
to check it out. Then bzr should not just throw up its hands and say it's not
possible.
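
To make the checkout problem concrete, here is a rough sketch in Python 3
syntax. The helper encode_filename and its backslash-escape fallback are
assumptions made up for illustration; they are not bzr code and not a
proposal for the final policy.

```python
# Illustrative only: encode_filename is a hypothetical helper, not a bzr API.

def encode_filename(name: str, fs_encoding: str) -> bytes:
    """Encode a Unicode file name for a file system that uses fs_encoding.

    If the target encoding lacks some character, fall back to backslash
    escapes instead of refusing outright (an assumed policy).
    """
    try:
        return name.encode(fs_encoding)
    except UnicodeEncodeError:
        # The name stays usable on disk, although it no longer looks right.
        return name.encode(fs_encoding, errors="backslashreplace")


name = "k\u0159"  # the file name from the example above

# The creator's system (iso-8859-2) can represent the name directly.
print(encode_filename(name, "iso-8859-2"))

# iso-8859-1 has no r-caron, so the fallback is used.
print(encode_filename(name, "iso-8859-1"))
```

An escaped name like this is ambiguous when read back from disk, so a real
implementation would probably need to remember the mapping somewhere.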

> And if people scream, we can go to a more complex approach of requiring
> versioned files to be unicode, but not unversioned files in the tree.
> 
> And if people scream, we can find ways to jam binary data into unicode,
> in one of the user-defined sections.

Well, any byte string can be decoded as 'latin-1' (every byte value maps to
a Unicode code point), so that part is not too hard.
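
A quick check of that claim, again as a Python 3 sketch, using David's mixed
path from above. The variable names are mine; nothing here is bzr code.

```python
# David's example: one path component in latin-1, the rest in utf-8.
mixed = ("/Utilisateurs/Édouard/".encode("latin-1")
         + "docs/thèse.tex".encode("utf-8"))

# Decoding the whole path as utf-8 fails on the latin-1 byte for 'É'.
try:
    mixed.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding as latin-1 never fails: each byte becomes one code point.
as_latin1 = mixed.decode("latin-1")

# The original bytes come back unchanged when encoding again.
round_trip = as_latin1.encode("latin-1")

# The same holds for every possible byte value.
all_bytes = bytes(range(256))

print(utf8_ok, round_trip == mixed)
print(all_bytes.decode("latin-1").encode("latin-1") == all_bytes)
```

The decoded string is not meaningful to a human (the utf-8 part shows up as
two characters per accented letter), but no information is lost.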

-- 
						 Jan 'Bulb' Hudec <bulb at ucw.cz>