Bazaar-NG traffic #2
Joel Rosdahl
joel at rosdahl.net
Tue Oct 11 16:46:43 BST 2005
Magnus Therning <magnus at therning.org> writes:
> On Tue, Oct 11, 2005 at 04:19:44PM +0200, Joel Rosdahl wrote:
>
> [...]
>> I have made some notes about how to use Unicode in a Python-based
>> project of mine:
>>
>> http://kofoto.rosdahl.net/trac/wiki/UnicodeInPython
>>
>> They may be useful for others too.
>
> Useful indeed.
>
> Note 2: os.listdir(u"path") returns Unicode strings for names that
> can be decoded with sys.getfilesystemencoding() but silently
> returns byte strings for names that can't be decoded. That is, the
> return value of os.listdir(u"path") is potentially a mixed list of
> Unicode and byte strings.
>
> Is the following non-ascii enough?
>
> $ ls | cat
> Hallå_där
> Köp_blåbär
>
> With unicode:
> [u'Hall\xe5_d\xe4r', u'K\xf6p_bl\xe5b\xe4r']
>
> Without:
> ['Hall\xc3\xa5_d\xc3\xa4r', 'K\xc3\xb6p_bl\xc3\xa5b\xc3\xa4r']
If you're trying to test my note about the mixed list of Unicode and
byte strings, then: no. :-) Both of your file names are valid UTF-8,
so they can all be decoded.
Try this program:
===[cut here]=========================================================
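import os
import shutil

os.mkdir("test")
os.chdir("test")
# Byte-string name: the lone byte 0xE5 is not valid UTF-8.
open("r\xe5ka", "w").close()
# Unicode name: encoded with sys.getfilesystemencoding() on creation.
open(u"r\xe4v", "w").close()
# A Unicode argument asks os.listdir for Unicode results.
print os.listdir(u".")
os.chdir("..")
shutil.rmtree("test")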
===[cut here]=========================================================
In a UTF-8 environment, the above program will print this:
['r\xe5ka', u'r\xe4v']
In an ISO-8859-1 environment, the program will print this:
[u'r\xe5ka', u'r\xe4v']
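The two results above can be reproduced without touching the file
system. The following is only a rough sketch of what I assume
os.listdir(u"path") does with each name, not its actual
implementation; decode_or_keep and simulate_listdir are hypothetical
helper names made up for illustration.

```python
def decode_or_keep(raw_name, encoding):
    """Return the decoded name, or the raw bytes if decoding fails."""
    try:
        return raw_name.decode(encoding)
    except UnicodeDecodeError:
        return raw_name


def simulate_listdir(encoding):
    """Model the test program for a given file system encoding."""
    names_on_disk = [
        b"r\xe5ka",                # created from a byte string, stored as-is
        u"r\xe4v".encode(encoding),  # created from Unicode, encoded on creation
    ]
    return [decode_or_keep(name, encoding) for name in names_on_disk]


utf8_result = simulate_listdir("utf-8")
latin1_result = simulate_listdir("iso-8859-1")
```

Under these assumptions, utf8_result is a mixed list (one byte string,
one Unicode string) and latin1_result contains only Unicode strings,
since every byte sequence is decodable as ISO-8859-1.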
> Hej Joel!
Hallå Magnus!
--
Regards,
Joel Rosdahl <joel at rosdahl.net>
Key BB845E97; fingerprint 9F4B D780 6EF4 5700 778D 8B22 0064 F9FF BB84 5E97