UnicodeEncodeError in add_action_print with non-ASCII file names

Nir Soffer nirs at freeshell.org
Sun Feb 5 00:47:27 GMT 2006


On 5 Feb, 2006, at 1:24, John A Meinel wrote:

> The tricky part with filenames is that Mac OSX (which I use)
> normalizes unicode filenames in an odd way. So we need to be able to
> re-normalize them internally.)

So if foo is a file name, might foo differ from
foo.decode('utf-8').encode('utf-8')?
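
To make the question concrete, here is a minimal sketch using the standard unicodedata module. It is written for a current Python 3 interpreter, which is an assumption on my side (the sessions in this mail are Python 2), and the helper name same_file_name is hypothetical. The decode/encode round trip itself is lossless for valid UTF-8; a difference only shows up when the same name comes back in a decomposed form, which is roughly what John describes for Mac OS X:

```python
import unicodedata

def same_file_name(a, b):
    """Compare two file names after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize('NFC', a) == unicodedata.normalize('NFC', b)

composed = 'caf\u00e9'                                # e-acute as a single code point
decomposed = unicodedata.normalize('NFD', composed)   # 'e' followed by a combining acute

# The UTF-8 round trip is lossless for each form on its own...
assert composed.encode('utf-8').decode('utf-8') == composed
assert decomposed.encode('utf-8').decode('utf-8') == decomposed

# ...but the two forms are different strings with different bytes.
print(composed == decomposed)
print(composed.encode('utf-8'), decomposed.encode('utf-8'))

# Normalizing both sides before comparing treats them as the same name.
print(same_file_name(composed, decomposed))

# These Hebrew letters have no canonical decomposition, so NFD leaves them alone.
hebrew_name = '\u05e9\u05dc\u05d5\u05dd'
print(unicodedata.normalize('NFD', hebrew_name) == hebrew_name)
```

If this is right, names like the Hebrew one used in the session further down would not show the problem at all, while accented Latin names would.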

Using unicode file names seems to work here on 10.3.9:

 >>> import os
 >>> hebrew_name = '\327\251\327\234\327\225\327\235'.decode('utf-8')
 >>> file(hebrew_name, 'w').write('')
 >>> os.listdir(u'.')
[u'.DS_Store', u'\u05e9\u05dc\u05d5\u05dd']

Strangely, os.path reports that it does not :-)

 >>> os.path.supports_unicode_filenames
False

I guess that using PyObjC will solve such problems:

 >>> from Foundation import *
 >>> NSString.stringWithString_(hebrew_name).fileSystemRepresentation()
'\xd7\xa9\xd7\x9c\xd7\x95\xd7\x9d'

Although it seems to be the same as Python's utf-8 encoding:

 >>> hebrew_name.encode('utf-8')
'\xd7\xa9\xd7\x9c\xd7\x95\xd7\x9d'

There is also Carbon.File, included in the standard library:

 >>> from Carbon import File
 >>> File.FSRef(hebrew_name).as_pathname()
'/Volumes/Home/nir/Desktop/utest/\xd7\xa9\xd7\x9c\xd7\x95\xd7\x9d'

BTW, Carbon.File.FSRef().as_pathname() is used by MoinMoin to get the 
real name of files, which solves annoying problems with PageName and 
pagename: both seem to exist according to os.path.exists(), although 
only one of them can exist on an HFS[+] file system.
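
The same idea can be sketched without Carbon by comparing against what os.listdir() reports. This is only an illustration under my own assumptions, not the code MoinMoin uses: the helper name real_name is hypothetical, it is written for Python 3, and it checks a single path component only.

```python
import os
import tempfile

def real_name(directory, name):
    """Return the directory entry matching name case-insensitively, or None.

    On a case-insensitive volume os.path.exists() accepts any spelling,
    so the listing is consulted to see how the name is actually stored.
    """
    wanted = name.lower()
    for entry in os.listdir(directory):
        if entry.lower() == wanted:
            return entry
    return None

# Small demonstration in a scratch directory.
scratch = tempfile.mkdtemp()
open(os.path.join(scratch, 'PageName'), 'w').close()

found = real_name(scratch, 'pagename')
print(found)                # the spelling stored on disk
print(found == 'pagename')  # whether the requested spelling is the real one
```

On a case-sensitive file system more than one entry may match; this sketch simply returns the first one it finds.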
	

Best Regards,

Nir Soffer




