[win32] non-ascii/non-english file names: internal usage of file names

Jan Hudec bulb at ucw.cz
Thu Dec 1 10:10:14 GMT 2005


On Thu, Dec 01, 2005 at 11:47:35 +0200, Alexander Belchenko wrote:
> Jan Hudec wrote:
> >>I believe on windows (NT/XP) the real encoding is actually UTF-16, so it
> >>shouldn't be a problem there.
> >
> >I believe it actually depends on the filesystem type. 
> 
> No, I believe it does not:
> 
> >>> import sys
> >>> sys.getfilesystemencoding()
> 'mbcs'
> 
> >I.e., that they use
> >utf-16 on NTFS, but cp<whatever> on FAT. And they have two ways of
> >calling syscalls, one using cp<whatever> and another using utf-16.
> >But unices certainly do syscalls with whatever the locale is, so legacy
> >unices do have a problem there.
> 
> Internally, Windows stores filenames in the mbcs encoding (multi-byte
> character set), even on FAT32. I think it is similar to Unicode, so Python
> has no problem working with Unicode filenames.

I checked, and you are right. They store the short filenames in the legacy
codepage and the long filenames in Unicode (not sure they support
surrogates, though).
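
For anyone who wants to try this, here is a minimal sketch of the two
checks discussed above. It is an illustration rather than anything from the
thread: it assumes a current Python 3, and the helper names
roundtrip_filename and utf16_code_units are made up for the example. The
first helper creates a file in a scratch directory and checks that
os.listdir() hands back the same name; the second counts UTF-16 code units,
so a character outside the BMP shows up as a surrogate pair.

```python
import os
import sys
import tempfile


def roundtrip_filename(name):
    """Create a file called `name` in a scratch directory and report
    whether os.listdir() returns exactly the same string.

    Hypothetical helper, written only for this illustration.
    """
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, name)
        with open(path, "w") as handle:
            handle.write("test")
        return name in os.listdir(scratch)


def utf16_code_units(text):
    """Return how many 16-bit code units `text` needs in UTF-16.

    Characters outside the Basic Multilingual Plane need two units
    (a surrogate pair). Hypothetical helper for this illustration.
    """
    return len(text.encode("utf-16-le")) // 2


if __name__ == "__main__":
    # Encoding Python uses to convert between str and OS file names.
    print(sys.getfilesystemencoding())

    # A Cyrillic name ("test" in Russian); the result depends on the
    # platform and filesystem, so no expected value is given here.
    print(roundtrip_filename("\u0442\u0435\u0441\u0442.txt"))

    # U+1D11E (musical G clef) lies outside the BMP.
    print(utf16_code_units("\U0001D11E"))  # 2
```

Whether a non-ASCII name survives the round trip depends on the platform
and filesystem, so the sketch prints that result instead of assuming it.
Only the code-unit count is fixed by the UTF-16 encoding itself.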

-- 
						 Jan 'Bulb' Hudec <bulb at ucw.cz>