[BUG] Unicode strings must always be used with encodings

Alexander Belchenko bialix at ukr.net
Mon Sep 26 21:00:51 BST 2005


John A Meinel wrote:
>> * to decode filenames to unicode strings, user_encoding must be used
> 
> I'm not sure about this last one. For instance, most Linux systems use 
> utf-8 as the encoding. And Windows uses UTF-16 (which Python doesn't 
> seem able to read).

When I print the os.listdir() result for one of my directories containing 
files with Russian filenames, I see that all the filenames are plain byte 
strings, not unicode strings. I based that last assumption on this 
behaviour of my Python 2.4.0. Maybe I am wrong, but right now on my system 
bzr fails every time I simply try to list those directories with it.
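
A minimal sketch of the decoding step discussed above, written for Python 3 
rather than the Python 2.4 mentioned in this message. The helper names 
decode_filename and list_decoded are hypothetical and are not part of bzr; 
the use of locale.getpreferredencoding() as the "user encoding" is an 
assumption for illustration.

```python
import locale
import os


def decode_filename(raw, encoding=None):
    """Return a text filename; byte strings are decoded, text passes through.

    Hypothetical helper, not bzr code. `encoding` defaults to the
    locale's preferred encoding (assumed to be the user encoding).
    """
    if isinstance(raw, str):
        return raw
    if encoding is None:
        encoding = locale.getpreferredencoding(False) or "utf-8"
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # Undecodable bytes become U+FFFD instead of aborting the listing.
        return raw.decode(encoding, "replace")


def list_decoded(path=b"."):
    """List a directory given as bytes and decode every entry to text."""
    return [decode_filename(name) for name in os.listdir(path)]
```

Passing a bytes path to os.listdir() makes it return byte-string names, 
which is the closest Python 3 equivalent of the behaviour described above. 
Whether to replace undecodable bytes or raise an error is a design choice; 
this sketch replaces them so that a listing never fails outright.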

> I think we also need to go through and pay close attention to when we 
> use os.sep and when we use "/".
> 
> I would say that internally all paths should be "/" separated, and that 
> is how they should be referenced in any internal files. Though I believe 
> <inventory> doesn't care, since it doesn't write directory lists, it 
> just keeps a reference to the parent. And I'm not sure where else would 
> store full paths. (.bzr/parent could store either path, since it 
> shouldn't be copied out of the branch)
> 
> I know I prefer to give bzr commands using forward slashes, so I don't 
> think we can assume that all user input comes in with os.sep.

Yes, I agree. Input must not be restricted to os.sep. But output needs to 
use os.sep, because a mess like the following:

...
added dir\subdir/
...

is far from pretty.
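
The separator handling discussed above could be sketched roughly as 
follows. This is only an illustration: to_internal and to_display are 
hypothetical names, not bzr functions, and the sep parameter is added here 
so the Windows case can be tried on any platform.

```python
import os


def to_internal(user_path, sep=os.sep):
    """Normalise user input to the internal '/'-separated form.

    Hypothetical helper. Both the native separator and '/' are accepted.
    On platforms where sep is already '/', the path is left untouched,
    because a backslash is a legal filename character there.
    """
    if sep != "/":
        user_path = user_path.replace(sep, "/")
    return user_path


def to_display(internal_path, sep=os.sep):
    """Render an internal '/'-separated path with the native separator."""
    return internal_path.replace("/", sep)
```

With sep set to a backslash, to_internal("dir\\subdir/", sep="\\") gives 
"dir/subdir/", and to_display("dir/subdir/", sep="\\") gives 
"dir\\subdir\\", so the mixed form shown above would not reach the user.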

Alexander.




