Bazaar-NG traffic #2

Magnus Therning magnus at therning.org
Tue Oct 11 15:55:05 BST 2005


On Tue, Oct 11, 2005 at 04:19:44PM +0200, Joel Rosdahl wrote:
>John A Meinel <john at arbash-meinel.com> writes:
>
>> Magnus Therning wrote:
>>
>>> On Tue, Oct 11, 2005 at 08:33:17AM -0500, John A Meinel wrote:
>>>
>>>> Do you have any of these directories/files available? I would be
>>>> curious what this returns:
>>>>
>>>> python -c "import os; print os.listdir(u'.')"
>>>> versus
>>>> python -c "import os; print os.listdir('.')"
>>>>
>>>> The first should try and interpret the names and return unicode,
>>>> the second should just do ascii names (possibly just byte-stream
>>>> names).
>>>
>>> Both return a list containing all the files in the current dir on
>>> my Linux machine. The first one is a list of unicode strings
>>> ([u'str1', u'str2']). The second is a list of regular strings
>>> (['str1', 'str2']). I.e. exactly what you predicted.
>>
>> Naturally, I would expect that. :) What I wanted to know was what
>> happens when you have non-ascii characters in that directory?
>> [...]
>
>os.listdir(u".") returns regular strings for names that can't be
>decoded using the filesystem encoding.
>
>I have made some notes about how to use Unicode in a Python-based
>project of mine:
>
>    http://kofoto.rosdahl.net/trac/wiki/UnicodeInPython
>
>They may be useful for others too.

Useful indeed.

 Note 2: os.listdir(u"path") returns Unicode strings for names that can
 be decoded with sys.getfilesystemencoding() but silently returns byte
 strings for names that can't be decoded. That is, the return value of
 os.listdir(u"path") is potentially a mixed list of Unicode and byte
 strings.
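
A minimal sketch of how a caller might normalise such a mixed list, written for Python 3, where the two types are str and bytes. Python 3's own os.listdir() handles undecodable names differently (a str path yields str names, with undecodable bytes escaped via surrogateescape), so the sketch works on an explicit example list. The helper name to_text, the sample names, and the choice to replace undecodable bytes with U+FFFD are assumptions made for illustration, not something taken from Joel's notes.

```python
import sys


def to_text(names, encoding=None):
    """Return every entry of `names` as str, decoding any bytes entries.

    `to_text` is a hypothetical helper for this sketch only.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    result = []
    for name in names:
        if isinstance(name, bytes):
            # Assumption: undecodable bytes become U+FFFD instead of raising.
            name = name.decode(encoding, errors="replace")
        result.append(name)
    return result


# A mixed list of the kind Note 2 describes: one text name, one UTF-8
# byte-string name, and one byte-string name that is not valid UTF-8.
mixed = ["Hall\xe5_d\xe4r", b"K\xc3\xb6p_bl\xc3\xa5b\xc3\xa4r", b"bad_\xff"]
normalized = to_text(mixed, encoding="utf-8")
print(normalized)
```

Replacing bad bytes loses information, so code that must reopen the file later would need to keep the original byte string around as well.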


Is the following non-ASCII enough?

 $ ls | cat
 Hallå_där
 Köp_blåbär

With a unicode argument, os.listdir(u'.'):
 [u'Hall\xe5_d\xe4r', u'K\xf6p_bl\xe5b\xe4r']

With a byte-string argument, os.listdir('.'):
 ['Hall\xc3\xa5_d\xc3\xa4r', 'K\xc3\xb6p_bl\xc3\xa5b\xc3\xa4r']
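
The two outputs above are the same names in two forms: the byte strings look like the UTF-8 encoding of the unicode strings. A small check, assuming UTF-8 is the filesystem encoding on that machine (the byte values suggest it, but the message does not say so) and using Python 3 str/bytes literals in place of the Python 2 reprs:

```python
# Names as printed for the unicode and byte-string calls, rewritten as
# Python 3 literals (str for the unicode list, bytes for the plain list).
unicode_names = ["Hall\xe5_d\xe4r", "K\xf6p_bl\xe5b\xe4r"]
byte_names = [b"Hall\xc3\xa5_d\xc3\xa4r", b"K\xc3\xb6p_bl\xc3\xa5b\xc3\xa4r"]

# Assumption: the filesystem encoding is UTF-8.
encoded = [name.encode("utf-8") for name in unicode_names]
decoded = [name.decode("utf-8") for name in byte_names]

print(encoded == byte_names)
print(decoded == unicode_names)
```

Each non-ASCII character (å, ä, ö) takes one code point in the unicode form and two bytes in the UTF-8 form, which is why the byte strings are longer.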

/M

Hi Joel!
>Joel Rosdahl <joel at rosdahl.net>
>Key BB845E97; fingerprint 9F4B D780 6EF4 5700 778D  8B22 0064 F9FF BB84 5E97
>

-- 
Magnus Therning                    (OpenPGP: 0xAB4DFBA4)
magnus at therning.org
http://therning.org/magnus

Software is not manufactured, it is something you write and publish.
Keep Europe free from software patents, we do not want censorship
by patent law on written works.

Klingon function calls do not have 'parameters' -- they have
'arguments' -- and they always win them.
     -- Things Likely to be Overheard If You Hire a Klingon Programmer