Greetings from London

Matthieu Moy Matthieu.Moy at imag.fr
Mon May 21 10:13:22 BST 2007


Lars Wirzenius <liw at liw.iki.fi> writes:

>>   But it does seem like this should be a compatibility
>> knob.  I believe that out of Mac, Windows and Linux, only Linux supports
>> non-utf8 paths.
>
> I don't know about Mac OS X, but traditionally, any UNIX system has
> allowed almost arbitrary filenames, caring only about the codes for
> ASCII characters NUL, slash, and period. So I don't think Linux is the
> only one, these days. :)

Yes, but that's cheating ;-). In Unix, a filename is a sequence of
bytes, not characters. It is the application's responsibility to
display it correctly (and therefore to take care of the encoding).
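
To make the bytes-versus-characters point concrete, here is a minimal
Python sketch. It assumes a Linux-like filesystem that stores names as
opaque bytes; the helper name list_raw_names and the file name are made
up for illustration only.

```python
import os
import tempfile


def list_raw_names(directory):
    """Return directory entries as raw bytes, with no decoding applied."""
    # Passing a bytes path makes os.listdir return bytes entries.
    return sorted(os.listdir(directory))


with tempfile.TemporaryDirectory() as tmp:
    raw_dir = os.fsencode(tmp)
    # The name is handed to the kernel as bytes. These bytes happen to
    # spell "caf" followed by e-acute in UTF-8, but the filesystem does
    # not know or care about that.
    name = b"caf\xc3\xa9"
    open(os.path.join(raw_dir, name), "wb").close()

    names = list_raw_names(raw_dir)
    # The same bytes give different text depending on the encoding the
    # application assumes when displaying them.
    as_utf8 = names[0].decode("utf-8")
    as_latin1 = names[0].decode("latin-1")
```

The filesystem hands back exactly the bytes it was given; which
characters the user sees depends only on the decoding the application
picks.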

Typically, if you have filenames with non-ASCII characters created
under LANG=some-latin1-locale, and later change to
LANG=some-UTF-8-locale, the filesystem doesn't see a difference, but
you'll get loads of problems with applications. In particular, an
application like bzr that stores filenames (in UTF-8 IIRC) will think
the files have moved. What you probably expect from bzr is that a file
named "éèâ" that you commit on a UTF-8 machine will be checked out as
"éèâ" on a latin1 machine. And _that_ is non-trivial.
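
A small Python sketch of why the two machines disagree, and of the
translation step a tool would need. This is only an illustration under
the assumption that names are stored as UTF-8; it is not a description
of what bzr actually does, and both function names are hypothetical.

```python
def filename_bytes(name, encoding):
    """Bytes that would end up on disk for `name` under a locale encoding."""
    return name.encode(encoding)


def translate_stored_name(stored, local_encoding):
    """Re-encode a name stored as UTF-8 (assumed) for the local filesystem."""
    return stored.decode("utf-8").encode(local_encoding)


# The example name from above, written with escapes so the source file's
# own encoding does not matter: e-acute, e-grave, a-circumflex.
name = "\u00e9\u00e8\u00e2"

on_utf8_machine = filename_bytes(name, "utf-8")
on_latin1_machine = filename_bytes(name, "latin-1")

# Same characters, different bytes: a naive byte comparison would
# conclude that the file was renamed.
looks_renamed = on_utf8_machine != on_latin1_machine

# Decoding the stored form and re-encoding it for the local locale
# gives the bytes a latin1 user expects to see.
checked_out = translate_stored_name(on_utf8_machine, "latin-1")
```

Note that the re-encoding raises UnicodeEncodeError for any character
the local encoding cannot represent, which is one of the reasons the
problem is not trivial.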

-- 
Matthieu


