Bazaar-NG traffic #2
Robey Pointer
robey at lag.net
Wed Oct 12 21:17:43 BST 2005
On 12 Oct 2005, at 0:44, Jan Hudec wrote:
> On Wed, Oct 12, 2005 at 09:39:43 +1000, Martin Pool wrote:
>
>> On 12/10/05, Aaron Bentley <aaron.bentley at utoronto.ca> wrote:
>>
>>
>>> I think we need to insist that any versioned files have unicode
>>> filenames. Considering that a bytestring filename for one user may be
>>> a unicode filename for another, I think we could have really ugly
>>> corner-case behaviour if we tried to support bytestring filenames.
>>>
>>
>> I agree. It seems far more likely that people will want to have
>> filenames that are meaningful in some language than filenames that
>> cannot be printed in any language at all. If they just have filenames
>> that are inconsistent with the standard encoding on their system then
>> that is a different and potentially soluble problem.
>>
>
> Well, it's pretty hard though. Any conversion you do on a
> non-representable name will be non-reversible. So you need to store a
> demangle table.
Sorry to bounce in late:
I think if a filename comes back as a byte string instead of a unicode
string, it's because Python couldn't decode it using the filesystem's
encoding. (AFAIK this is mostly a Unix problem.*) In that case, if you
just pretend the filename is in Latin-1, you will preserve the gibberish
filename: Latin-1 maps every possible byte value 0-255 to a unicode
character, so the conversion is non-lossy. The gibberish filename can be
reconstituted as the same gibberish on the other end.
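
To make the Latin-1 idea concrete, here is a minimal sketch in present-day
Python. It is not bzr code: the helper name decode_filename, the default
fs_encoding and the sample bytes are all made up for illustration. It tries
the filesystem encoding first and falls back to Latin-1, then shows that the
fallback can be reversed byte for byte.

```python
def decode_filename(raw: bytes, fs_encoding: str = "utf-8") -> str:
    """Hypothetical helper: decode a raw filename, falling back to Latin-1."""
    try:
        return raw.decode(fs_encoding)
    except UnicodeDecodeError:
        # Latin-1 maps each byte 0-255 to the code point with the same
        # number, so this decode can never fail and loses nothing.
        return raw.decode("latin-1")


# A name that is not valid UTF-8 (sample bytes, chosen for illustration).
gibberish = b"caf\xe9-\xff\xfe.txt"
as_text = decode_filename(gibberish)

# "The other end" re-encodes with Latin-1 and gets the same bytes back.
restored = as_text.encode("latin-1")

# The round trip holds for every possible byte value.
all_bytes = bytes(range(256))
all_restored = all_bytes.decode("latin-1").encode("latin-1")
```

One caveat with this sketch: the receiving side has to know that a given
name went through the fallback, because two different byte strings can end
up as the same unicode name (b"\xe9" via Latin-1 and b"\xc3\xa9" via UTF-8
both decode to the same single character).
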
robey
* For example, even though Linux defines ext2/ext3 filenames as being
UTF-8 encoded, I don't think anything ever enforces this.