Bazaar-NG traffic #2

Robey Pointer robey at lag.net
Wed Oct 12 21:17:43 BST 2005


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On 12 Oct 2005, at 0:44, Jan Hudec wrote:

> On Wed, Oct 12, 2005 at 09:39:43 +1000, Martin Pool wrote:
>
>> On 12/10/05, Aaron Bentley <aaron.bentley at utoronto.ca> wrote:
>>
>>
>>> I think we need to insist that any versioned files have unicode
>>> filenames.  Considering that a bytestring filename for one user may
>>> be a unicode filename for another, I think we could have really ugly
>>> corner-case behaviour if we tried to support bytestring filenames.
>>>
>>
>> I agree.  It seems far more likely that people will want to have
>> filenames that are meaningful in some language than filenames that
>> cannot be printed in any language at all.  If they just have
>> filenames that are inconsistent with the standard encoding on their
>> system then that is a different and potentially soluble problem.
>>
>
> Well, it's pretty hard though. Any conversion you do on a
> non-representable name will be non-reversible. So you need to store a
> demangle table.

Sorry to bounce in late:

I think if a filename comes back as a byte string instead of unicode,
it's because Python couldn't decode it using the filesystem's encoding.
(AFAIK this is mostly a Unix problem.*)  In that case, if you just
pretend the filename is in Latin-1, you will preserve the gibberish
filename: Latin-1 defines a unicode char for every possible byte
0-255, so the conversion is non-lossy.  The gibberish filename can be
reconstituted as the same gibberish on the other end by encoding it
back to Latin-1.
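
A minimal sketch of the Latin-1 round trip described above, written in
Python 3 syntax (which this thread predates).  The helper name
`to_unicode` and its `fs_encoding` default are illustrative assumptions,
not anything taken from bzr:

```python
# Sketch only: Latin-1 maps each byte 0-255 to the code point with the
# same number, so decoding never fails and encoding restores the bytes.
raw = bytes(range(256))              # every possible byte value
as_text = raw.decode("latin-1")      # 256 characters, U+0000..U+00FF
restored = as_text.encode("latin-1")
assert restored == raw               # the round trip is lossless


def to_unicode(name, fs_encoding="utf-8"):
    """Hypothetical helper: try the filesystem encoding first and fall
    back to Latin-1 when the bytes cannot be decoded."""
    try:
        return name.decode(fs_encoding)
    except UnicodeDecodeError:
        # Undecodable "gibberish" name: keep every byte as one character.
        return name.decode("latin-1")


mangled = to_unicode(b"\xff\xfe")    # not valid UTF-8, so Latin-1 is used
assert mangled.encode("latin-1") == b"\xff\xfe"
```

One thing the sketch leaves open: the result alone does not say which
branch was taken, so two different byte strings can end up as the same
unicode name (for instance b"caf\xc3\xa9" read as UTF-8 and b"caf\xe9"
read as Latin-1).  The receiving end would need some way to know which
encoding to use when turning the name back into bytes.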

robey

* For example, even though ext2/ext3 filenames on Linux are
conventionally UTF-8 encoded, I don't think anything ever enforces this.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Darwin)

iD8DBQFDTW9rQQDkKvyJ6cMRAj7kAJ0d+V38k6hXuG2TgFe/OIIv63GijwCguSh0
P/N+viwCiFf7ijEipRBZKcU=
=uJzN
-----END PGP SIGNATURE-----



