[BUG] Unicode string must be always used with encodings
John A Meinel
john at arbash-meinel.com
Tue Sep 27 04:23:12 BST 2005
Robey Pointer wrote:
>
> On 26 Sep 2005, at 13:00, Alexander Belchenko wrote:
>
>> John A Meinel wrote:
>>
>>>> * to decode filenames to unicode strings, user_encoding must be
>>>> used
>>>>
>>> I'm not sure about this last one. For instance, most Linux systems
>>> use utf-8 as the encoding, and Windows uses UTF-16 (which Python
>>> doesn't seem able to read).
>>>
>>
>> When I print the os.listdir() result for one of my directories whose
>> files have Russian filenames, I see that every filename is a plain
>> byte string, not a unicode string. I based my last assumption on this
>> behaviour of my Python 2.4.0. Maybe I am wrong, but right now bzr fails
>> on my system every time I simply try to list those directories with it.
>
>
> The trick is to use a unicode string for os.listdir:
>
> >>> os.listdir('/Users/robey/crap')
> ['rand\xc3\xb8m stuff']
> >>> os.listdir(u'/Users/robey/crap')
> [u'rand\xf8m stuff']
>
> robey
Thanks for the pointer. I knew that both Windows and Python supported
unicode; I just didn't know how to make it work.
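
To make Robey's trick concrete, here is a minimal sketch of the same
behaviour. It is written with Python 3 names, where the text type is str
and the byte type is bytes (in the Python 2.4 discussed in this thread
those are unicode and str). The helper name list_both_ways is hypothetical
and only serves the illustration; the exact bytes returned depend on the
filesystem encoding of the machine.

```python
import os


def list_both_ways(directory):
    """Return (text_names, byte_names) for the entries of `directory`.

    `directory` is a text (unicode) path. Hypothetical helper, for
    illustration only.
    """
    # Text path in, text names out.
    text_names = sorted(os.listdir(directory))
    # Byte path in, byte names out (raw, still in the filesystem encoding).
    byte_names = sorted(os.listdir(os.fsencode(directory)))
    return text_names, byte_names


if __name__ == "__main__":
    import tempfile

    tmp = tempfile.mkdtemp()
    # Create one file with a non-ASCII name, as in Robey's example.
    open(os.path.join(tmp, u"rand\xf8m stuff"), "w").close()
    print(list_both_ways(tmp))
```

The point is only that the type of the argument selects the type of the
result, so code that wants unicode filenames can ask for them directly.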
So back to Alexander's comment: what is the actual error when you use
bzr to list those directories? With that we can track it down and make
sure it works.
I'm guessing that Windows will return unicode strings when given a
unicode path, so we won't have to worry about the user encoding.
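
If byte-string filenames do turn up anyway, Alexander's suggestion amounts
to something like the sketch below. This is an assumption about how it
could look, not bzr's actual code: decode_filename is a hypothetical
helper, and locale.getpreferredencoding() stands in for user_encoding.

```python
import locale


def decode_filename(name, encoding=None):
    """Return `name` as text. Hypothetical helper, for illustration only."""
    # Names that are already text need no decoding.
    if not isinstance(name, bytes):
        return name
    if encoding is None:
        # Assumption: the locale's preferred encoding plays the role of
        # user_encoding here.
        encoding = locale.getpreferredencoding()
    # Strict decoding: raises UnicodeDecodeError if the bytes do not
    # match the encoding, rather than silently producing a wrong name.
    return name.decode(encoding)
```

For example, decode_filename(b'rand\xc3\xb8m stuff', 'utf-8') gives back
the same name Robey got from the unicode call. Failing loudly on a
mismatch seems safer than guessing, since a wrongly decoded name would
not match the file on disk.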
John
=:->