Bazaar-NG traffic #2
David Allouche
david at allouche.net
Wed Oct 12 23:51:16 BST 2005
On Wed, 2005-10-12 at 13:17 -0700, Robey Pointer wrote:
> I think if a filename comes back as a string instead of unicode, it's
> because python couldn't decode it using the filesystem's encoding.
> (AFAIK this is mostly a unix problem.*) In that case if you just
> pretend the filename is in Latin-1, you will preserve the gibberish
> filename: Latin-1 defines a unicode char for every possible byte
> 0-255, so it's non-lossy. The gibberish filename can be
> reconstituted as the same gibberish on the other end.
It's lossy.
Because by decoding as latin-1 and then encoding to utf-8, you lose the
information that "this file name is a byte stream, not a unicode
string".
In other words, you do not know which names would need to be "fixed":
the computer will no longer be able to tell the difference between the
gibberish names and the meaningful ones.
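To make the collision concrete, here is a minimal sketch. It assumes
Python 3 (where undecodable names are bytes and decoded names are str),
and store_name is a hypothetical helper, not anything in bzr:

```python
def store_name(name):
    """Hypothetical storage step: return the UTF-8 form that gets recorded."""
    if isinstance(name, bytes):
        # Undecodable name: pretend it is Latin-1, as proposed above.
        name = name.decode("latin-1")
    return name.encode("utf-8")


raw = b"\xe9t\xe9"      # a byte-string name that is not valid UTF-8
real = "\u00e9t\u00e9"  # a name that was decoded properly

stored_raw = store_name(raw)
stored_real = store_name(real)

# The bytes themselves survive the round trip...
roundtrip = stored_raw.decode("utf-8").encode("latin-1") == raw

# ...but both names are stored identically, so nothing records which one
# started life as a raw byte string.
collide = stored_raw == stored_real
```

Both roundtrip and collide come out true: the bytes are recoverable, but
only if you already know which stored names to re-encode as latin-1.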
If you want to preserve the gibberishness, you need to attach a metadata
bit to all file names.
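One way such a metadata bit could look, as a rough sketch only. It
assumes Python 3, and StoredName, to_stored and from_stored are made-up
names for illustration, not an existing API:

```python
from typing import NamedTuple


class StoredName(NamedTuple):
    text: str        # the name as a unicode string
    raw_bytes: bool  # the metadata bit: True if the original was a byte string


def to_stored(name):
    """Record a name together with a flag saying whether it was raw bytes."""
    if isinstance(name, bytes):
        # Latin-1 maps every byte 0-255 to a code point, so this never fails.
        return StoredName(name.decode("latin-1"), True)
    return StoredName(name, False)


def from_stored(stored):
    """Rebuild the original name, using the flag to pick bytes or unicode."""
    if stored.raw_bytes:
        return stored.text.encode("latin-1")
    return stored.text
```

With the flag kept next to the text, to_stored(b"\xe9t\xe9") and
to_stored("\u00e9t\u00e9") are no longer equal, and from_stored gives
back the byte string for the first and the unicode string for the second.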
--
-- ddaa