Bazaar-NG traffic #2
Magnus Therning
magnus at therning.org
Tue Oct 11 14:50:56 BST 2005
On Tue, Oct 11, 2005 at 08:33:17AM -0500, John A Meinel wrote:
>David Allouche wrote:
>> On Tue, 2005-10-11 at 03:21 -0400, James Blackwell wrote:
>>
>>>= _Always_ Unicode =
>>>
>>>The Unicode discussions continued this week. Last week, Alexander Belchenko
>>>referred to some bzr code that didn't handle Russian filenames properly.
>>>This week Belchenko followed up a couple more times without a response.
>>
>>
>> Something which has been somewhat nagging me...
>>
>> I would like it if it were possible to have byte-stream file names. In some
>> situations (e.g. automated imports from CVS) you might end up with file
>> names containing non-ASCII characters without any encoding information.
>> Trying to interpret those names as Unicode is haphazard at best, and
>> likely incorrect.
>>
>> Generally, when getting data from legacy sources, you cannot expect to
>> have encoding information. I would like to read about how CVS handles
>> non-ascii file names, from people who have direct experience with that.
>>
>> To be honest, we only once had non-ASCII file names in source code
>> repositories in a few hundred mainline imports, but the numbers are
>> biased since we have been focusing on increasing the number of
>> successful imports, disregarding (numerous) import failures.
>
>Do you have any of these directories/files available? I would be
>curious what this returns:
>
>python -c "import os; print os.listdir(u'.')"
>versus
>python -c "import os; print os.listdir('.')"
>
>The first should try and interpret the names and return unicode, the
>second should just do ascii names (possibly just byte-stream names).
Both return a list containing all the files in the current directory on
my Linux machine. The first is a list of unicode strings ([u'str1',
u'str2']). The second is a list of regular byte strings (['str1',
'str2']). I.e. exactly what you predicted.
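
For anyone repeating this comparison on Python 3, here is a minimal sketch
of the same check. It assumes Python 3, where the two one-liners above no
longer behave this way: passing a str path to os.listdir gives str names,
and passing a bytes path gives raw bytes names. The helper name
list_both_ways and the file names are only illustrative.

```python
import os
import tempfile


def list_both_ways(directory):
    """Return (text_names, byte_names) for the entries in directory."""
    # A str path asks Python to decode each name with the filesystem encoding.
    text_names = sorted(os.listdir(directory))
    # A bytes path returns the names as raw, undecoded byte strings.
    byte_names = sorted(os.listdir(os.fsencode(directory)))
    return text_names, byte_names


if __name__ == "__main__":
    # Use a scratch directory so the example does not depend on the cwd.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("str1", "str2"):
            open(os.path.join(tmp, name), "w").close()
        text_names, byte_names = list_both_ways(tmp)
        print(text_names)   # ['str1', 'str2']
        print(byte_names)   # [b'str1', b'str2']
```

The bytes form is probably the closer match to the byte-stream file names
David asked about, since no encoding is assumed for the names.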
/M
--
Magnus Therning (OpenPGP: 0xAB4DFBA4)
magnus at therning.org
http://therning.org/magnus
Software is not manufactured, it is something you write and publish.
Keep Europe free from software patents, we do not want censorship
by patent law on written works.
The law does not allow me to testify on any aspect of the National
Security Agency, even to the Senate Intelligence Committee.
-- General Allen, Director of the NSA