Bazaar-NG traffic #2

David Allouche david at allouche.net
Wed Oct 12 23:51:16 BST 2005


On Wed, 2005-10-12 at 13:17 -0700, Robey Pointer wrote:
> I think if a filename comes back as a string instead of unicode, it's  
> because python couldn't decode it using the filesystem's encoding.   
> (AFAIK this is mostly a unix problem.*)  In that case if you just  
> pretend the filename is in Latin-1, you will preserve the gibberish  
> filename: Latin-1 defines a unicode char for every possible byte  
> 0-255, so it's non-lossy.  The gibberish filename can be  
> reconstituted as the same gibberish on the other end.

It's lossy.

By decoding as latin-1 and then encoding to utf-8, you lose the
information that "this file name is a byte stream, not a unicode
string".

In other words, you do not know which names would need to be "fixed":
the computer can no longer tell the gibberish names apart from the
meaningful ones.
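
To make that concrete, here is a rough sketch in present-day Python
(bytes vs. str rather than str vs. unicode). The store() helper is
hypothetical, it only stands in for "decode as latin-1 if undecodable,
then write utf-8", and the file names are made up for illustration.

```python
def store(name):
    """Hypothetical storage step: byte-string names are decoded as
    Latin-1, then every name is written out as UTF-8."""
    if isinstance(name, bytes):
        name = name.decode("latin-1")
    return name.encode("utf-8")

# A name the filesystem encoding (assumed UTF-8 here) could not decode.
raw_bytes_name = b"r\xe9sum\xe9"
# A perfectly meaningful unicode name that happens to look the same.
real_text_name = "r\u00e9sum\u00e9"

stored_raw = store(raw_bytes_name)
stored_text = store(real_text_name)

# Both names are stored as identical bytes, so the origin is gone.
indistinguishable = stored_raw == stored_text

# The original bytes can be rebuilt, but only by someone who already
# knows this particular name started life as a byte string.
recovered = stored_raw.decode("utf-8").encode("latin-1")
```

The bytes survive the round trip, but the receiving end has no way to
know whether to write the raw bytes back or to encode the name in its
own filesystem encoding.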

If you want to preserve the gibberishness, you need to attach a metadata
bit to all file names.
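
A minimal sketch of what such a metadata bit could look like, again in
present-day Python. StoredName, to_stored() and from_stored() are
hypothetical names invented for this example; nothing like this is
claimed to exist in bzr.

```python
from typing import NamedTuple, Union


class StoredName(NamedTuple):
    text: str         # the name as unicode text
    was_bytes: bool   # the metadata bit: True if it was a raw byte string


def to_stored(name: Union[str, bytes]) -> StoredName:
    """Record the name together with whether it was undecodable bytes."""
    if isinstance(name, bytes):
        return StoredName(name.decode("latin-1"), True)
    return StoredName(name, False)


def from_stored(stored: StoredName) -> Union[str, bytes]:
    """Give back raw bytes for flagged names, unicode text otherwise."""
    if stored.was_bytes:
        return stored.text.encode("latin-1")
    return stored.text


gibberish = to_stored(b"r\xe9sum\xe9")
meaningful = to_stored("r\u00e9sum\u00e9")
```

With the flag stored next to every name, the two entries stay distinct
even though their text is identical, and each one converts back to
what it originally was.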
-- 
                                                            -- ddaa