Unicode through filesystem tricks
Alexander Belchenko
bialix at ukr.net
Fri Jan 13 19:32:07 GMT 2006
John A Meinel wrote:
> I just found something interesting about Mac's filesystem, when dealing
> with unicode filenames.
> Specifically, we have this problem, if we create a file named:
> räksmörgås
>
> This corresponds to the unicode string:
> u"r\xe4ksm\xf6rg\xe5s"
>
> Where \xe4 is the letter 'a' with two dots on it.
>
> However, the string we get back from the filesystem is:
> u"ra\u0308ksmo\u0308rga\u030as"
I tested it on Windows. On Windows these names are not equivalent, and
they produce 2 different files:
Python 2.4.2 (#67, Sep 28 2005, 12:41:11) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> f = file(u"ra\u0308ksmo\u0308rga\u030as", 'wb')
>>> f
<open file u'ra\u0308ksmo\u0308rga\u030as', mode 'wb' at 0x01021608>
>>> f.write('spam')
>>> f.close()
>>> f = file(u"r\xe4ksm\xf6rg\xe5s", 'rb')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IOError: [Errno 2] No such file or directory: u'r\xe4ksm\xf6rg\xe5s'
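
For reference, the two spellings look like the composed (NFC) and
decomposed (NFD) normalization forms of the same text. Below is a minimal
sketch, assuming the standard unicodedata module, of comparing names only
after normalizing both. The helper name same_filename is hypothetical and
is not something that exists in bzr:

```python
import unicodedata

# Precomposed spelling, as typed in John's example.
composed = u"r\xe4ksm\xf6rg\xe5s"
# Decomposed spelling, as reported back by the Mac filesystem.
decomposed = u"ra\u0308ksmo\u0308rga\u030as"

def same_filename(a, b):
    """Hypothetical helper: compare two names after normalizing both to NFC."""
    return unicodedata.normalize('NFC', a) == unicodedata.normalize('NFC', b)

# A plain comparison sees two different strings, which matches what
# Windows does when it treats them as two different files.
assert composed != decomposed

# After normalization the two names compare equal.
assert same_filename(composed, decomposed)

# Decomposing the composed spelling gives exactly the other string.
assert unicodedata.normalize('NFD', composed) == decomposed
```

This only shows that the strings can be matched in Python. It does not
change what the filesystem itself accepts or returns.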
--
Alexander