[BUG] Unicode string must be always used with encodings

John A Meinel john at arbash-meinel.com
Tue Sep 27 04:23:12 BST 2005


Robey Pointer wrote:
> 
> On 26 Sep 2005, at 13:00, Alexander Belchenko wrote:
> 
>> John A Meinel wrote:
>>
>>>> * to decode filenames to unicode strings, user_encoding must be
>>>> used
>>>>
>>> I'm not sure about this last one. For instance, most Linux systems
>>> use utf-8 as the encoding. And Windows uses UTF-16 (which python
>>> doesn't seem able to read).
>>>
>>
>> When I print the os.listdir() result for one of my directories
>> containing files with Russian filenames, I see that all the filenames
>> are plain byte strings, not unicode strings. I based that last
>> assumption on this behaviour of my Python 2.4.0. Maybe I am wrong,
>> but right now on my system bzr fails every time I simply try to list
>> those directories.
> 
> 
> The trick is to use a unicode string for os.listdir:
> 
>  >>> os.listdir('/Users/robey/crap')
> ['rand\xc3\xb8m stuff']
>  >>> os.listdir(u'/Users/robey/crap')
> [u'rand\xf8m stuff']
> 
> robey

Thanks for the pointer. I knew that Windows supported unicode, and so 
did python, I just didn't know how to make it work.

So, back to Alexander's comment: what is the actual error when you use
bzr to list those directories? Knowing that would let us track it down
and make sure it works.

I'm guessing that Windows will return unicode strings, so that we don't 
have to worry about user encoding.
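
For illustration, here is a minimal sketch of Robey's trick. It is
written for present-day Python 3, which is an assumption on my part:
the session quoted above is Python 2, where the two types are str and
unicode rather than bytes and str. The helper names list_names and
demo are hypothetical and are not part of bzr.

```python
import os
import tempfile


def list_names(directory):
    """Return (byte_names, text_names) for the entries of `directory`.

    Hypothetical helper: it only shows that the type of the path passed
    to os.listdir decides the type of the names that come back.
    """
    # A bytes path yields raw, still-encoded filenames.
    byte_names = sorted(os.listdir(os.fsencode(directory)))
    # A text (unicode) path yields already-decoded filenames.
    text_names = sorted(os.listdir(directory))
    return byte_names, text_names


def demo():
    """Create one non-ASCII filename in a scratch directory and list it."""
    with tempfile.TemporaryDirectory() as scratch:
        # Same name as in the quoted session: 'randøm stuff'.
        open(os.path.join(scratch, 'rand\xf8m stuff'), 'w').close()
        return list_names(scratch)
```

Calling demo() returns the same entry twice, once as bytes and once
as text. Which bytes appear depends on the filesystem encoding, so the
sketch does not assume any particular one.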

John
=:->




