UnicodeEncodeError in add_action_print with non ascii files names
Nir Soffer
nirs at freeshell.org
Sun Feb 5 00:47:27 GMT 2006
On 5 Feb, 2006, at 1:24, John A Meinel wrote:
> The tricky part with filenames is that Mac OSX (which I use)
> normalizes unicode filenames in an odd way. So we need to be able to
> re-normalize them internally.)
So if foo is a file name, foo may not be equal to
foo.decode('utf-8').encode('utf-8')?
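As far as I know, decoding and re-encoding valid UTF-8 is lossless, so the round trip alone should not change a name. What can differ is the Unicode normalization form, since HFS+ stores names in a (mostly) decomposed form. Here is a minimal sketch of that distinction, written in Python 3 syntax rather than the Python 2 used in the session below. `utf8_round_trip` and `same_file_name` are hypothetical helpers for illustration, not part of bzr:

```python
import unicodedata


def utf8_round_trip(raw: bytes) -> bytes:
    """Decode UTF-8 bytes and encode them again."""
    return raw.decode("utf-8").encode("utf-8")


def same_file_name(a: str, b: str) -> bool:
    """Compare two names after bringing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


composed = "caf\u00e9"                               # e-acute as one code point
decomposed = unicodedata.normalize("NFD", composed)  # 'e' followed by U+0301

# The round trip itself never changes valid UTF-8 bytes.
assert utf8_round_trip(composed.encode("utf-8")) == composed.encode("utf-8")
assert utf8_round_trip(decomposed.encode("utf-8")) == decomposed.encode("utf-8")

# The two spellings differ as strings and as bytes...
assert composed != decomposed
assert composed.encode("utf-8") != decomposed.encode("utf-8")

# ...but compare equal once both are normalized to the same form.
assert same_file_name(composed, decomposed)
```

Normalizing both sides to one form before comparing is one possible way to "re-normalize internally"; whether NFC or NFD is the better internal form is a separate design choice.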
Using unicode file names seems to work here on 10.3.9:
>>> import os
>>> hebrew_name = '\327\251\327\234\327\225\327\235'.decode('utf-8')
>>> file(hebrew_name, 'w').write('')
>>> os.listdir(u'.')
[u'.DS_Store', u'\u05e9\u05dc\u05d5\u05dd']
Strangely, os.path claims unicode file names are not supported :-)
>>> os.path.supports_unicode_filenames
False
I guess that using PyObjC will solve such problems:
>>> from Foundation import *
>>> NSString.stringWithString_(hebrew_name).fileSystemRepresentation()
'\xd7\xa9\xd7\x9c\xd7\x95\xd7\x9d'
Although it seems to be the same as Python's UTF-8 encoding:
>>> hebrew_name.encode('utf-8')
'\xd7\xa9\xd7\x9c\xd7\x95\xd7\x9d'
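A likely reason the two results agree for this particular name is that these Hebrew letters have no canonical decomposition, so the composed and decomposed forms are the same string. A small sketch to check this, in Python 3 syntax; `is_normalization_stable` is a hypothetical helper name, and the accented name is only there for contrast:

```python
import unicodedata

hebrew_name = "\u05e9\u05dc\u05d5\u05dd"


def is_normalization_stable(name: str) -> bool:
    """True if the name is unchanged by both NFC and NFD normalization."""
    return all(
        unicodedata.normalize(form, name) == name for form in ("NFC", "NFD")
    )


# The Hebrew name is identical in both forms, so its UTF-8 bytes are too.
assert is_normalization_stable(hebrew_name)
assert hebrew_name.encode("utf-8") == b"\xd7\xa9\xd7\x9c\xd7\x95\xd7\x9d"

# A name with a precomposed accent is not stable: NFD splits the accent off.
assert not is_normalization_stable("caf\u00e9")
```

So a test with this name cannot show the normalization problem; a name containing accented Latin letters would be a better probe.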
There is also Carbon.File, included in the standard library:
>>> from Carbon import File
>>> File.FSRef(hebrew_name).as_pathname()
'/Volumes/Home/nir/Desktop/utest/\xd7\xa9\xd7\x9c\xd7\x95\xd7\x9d'
BTW, Carbon.File.FSRef().as_pathname() is used by MoinMoin to get the
real name of files, which solves annoying problems with PageName and
pagename: both seem to exist according to os.path.exists(), although
only one of them can exist on an HFS[+] file system.
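Where Carbon is not available, the same idea can be approximated by scanning the directory listing. This is only a sketch in Python 3 syntax, not MoinMoin's code; `real_name` is a hypothetical helper, and it ignores Unicode normalization entirely:

```python
import os
import tempfile


def real_name(directory: str, name: str):
    """Return the directory entry that matches name ignoring case, or None."""
    wanted = name.casefold()
    for entry in os.listdir(directory):
        if entry.casefold() == wanted:
            return entry
    return None


with tempfile.TemporaryDirectory() as d:
    # Create the file under its "real" spelling.
    open(os.path.join(d, "PageName"), "w").close()
    # A lookup with different case reports the spelling stored on disk.
    assert real_name(d, "pagename") == "PageName"
    assert real_name(d, "missing") is None
```

Comparing the returned entry with the requested name tells you whether the caller used the spelling that is actually stored, which os.path.exists() alone cannot tell you on a case-insensitive file system.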
Best Regards,
Nir Soffer