Unicode through filesystem tricks

Alexander Belchenko bialix at ukr.net
Fri Jan 13 19:32:07 GMT 2006


John A Meinel wrote:
> I just found something interesting about Mac's filesystem, when dealing
> with unicode filenames.
> Specifically, we have this problem, if we create a file named:
> räksmörgås
> 
> This corresponds to the unicode string:
> u"r\xe4ksm\xf6rg\xe5s"
> 
> Where \xe4 is the letter 'a' with two dots on it.
> 
> However, the string we get back from the filesystem is:
> u"ra\u0308ksmo\u0308rga\u030as"

I tested this on Windows. There the two names are not equivalent, and
they produce 2 different files:

Python 2.4.2 (#67, Sep 28 2005, 12:41:11) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
 >>> f = file(u"ra\u0308ksmo\u0308rga\u030as", 'wb')
 >>> f
<open file u'ra\u0308ksmo\u0308rga\u030as', mode 'wb' at 0x01021608>
 >>> f.write('spam')
 >>> f.close()
 >>> f = file(u"r\xe4ksm\xf6rg\xe5s", 'rb')
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
IOError: [Errno 2] No such file or directory: u'r\xe4ksm\xf6rg\xe5s'
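
For reference, the two spellings are the composed (NFC) and decomposed
(NFD) normalization forms of the same text, which can be checked with the
standard unicodedata module. This is only a minimal sketch: it assumes
Python 3 (the session above is Python 2.4), and the variable names are
made up for illustration.

```python
import unicodedata

# Precomposed spelling: one code point per accented letter (NFC).
composed = "r\xe4ksm\xf6rg\xe5s"
# Decomposed spelling: base letter followed by a combining mark (NFD).
decomposed = "ra\u0308ksmo\u0308rga\u030as"

# Compared code point by code point, the two strings are different...
print(composed == decomposed)
# ...but normalizing either one to the other's form makes them equal.
print(unicodedata.normalize("NFD", composed) == decomposed)
print(unicodedata.normalize("NFC", decomposed) == composed)
```

Normalizing names to one form before comparing them is one possible way
to treat the two spellings as the same name; whether that is the right
choice here is still open.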


--
Alexander




