Unicode Normalization
John Arbash Meinel
john at arbash-meinel.com
Thu Jun 29 14:55:23 BST 2006
Robey Pointer wrote:
>
> On 26 Jun 2006, at 7:32, John Arbash Meinel wrote:
>
>> Right now, I think the best way to go would be to do something in
>> list_files, similar to how WorkingTree does it now for ignored files.
>>
>> Basically, you go through, and if you know a file is versioned, you just
>> return it. If it doesn't match the inventory, you check if it needs to
>> be normalized. And if the name changes, you then check again if it is
>> versioned, and then go on to check if it is ignored, etc.
>>
>> Does this seem reasonable? It adds an extra function call, and an if
>> statement to the list_files loop. Which I'm not super keen on (since it
>> affects initial 'add' performance).
>> But I think it has the least impact in the case that most of the files
>> are versioned, and most of them are not fancy unicode, while still
>> correctly handling filenames on all platforms.
>
> I'm basically +1 on this approach.
>
> Would this be a good way to handle case normalization too? On Mac and
> Windows, "README" and "ReadMe" are the same file: case is preserved but
> not significant. This has actually caused me a problem once or twice
> with files in other VCS. It'd be nice if bzr went "I don't know about
> ReadMe but README is versioned and you're on a mac so they're the same
> file."
>
> robey
That is a little bit trickier, since you would have to fix case for both
the inventory and for the filesystem.
But something like that should be possible.
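To make the README/ReadMe case concrete, here is a minimal sketch of the lookup half only, written for current Python 3. The names build_case_index and resolve_case are made up for illustration and are not bzrlib API, and str.casefold() is only a stand-in for whatever folding rules the real filesystem applies (HFS+ and NTFS each have their own). It does not attempt the trickier part, which is reconciling case between the inventory and the filesystem.

```python
def build_case_index(versioned):
    """Map a case-folded key to the spelling stored in the inventory.

    Hypothetical helper: 'versioned' is any iterable of versioned names.
    """
    index = {}
    for name in versioned:
        # Keep the first spelling seen if two names fold to the same key.
        index.setdefault(name.casefold(), name)
    return index


def resolve_case(disk_name, versioned, case_index, case_insensitive_fs):
    """Return the inventory spelling for disk_name, or None if unversioned."""
    # Exact match first, so the common case costs one set lookup.
    if disk_name in versioned:
        return disk_name
    # Only fall back to case folding where the platform ignores case.
    if case_insensitive_fs:
        return case_index.get(disk_name.casefold())
    return None
```

With versioned = {'README'}, resolving 'ReadMe' on a case-insensitive filesystem gives back 'README', i.e. the inventory's spelling wins and the stored case is preserved. On a case-sensitive filesystem the same lookup returns None.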
I would like us to stay 'case-preserving' on case-insensitive
filesystems. I'm intentionally not being 'unicode-normalization-preserving'.
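A rough sketch of the list_files idea from the quoted text above, assuming NFC as the normalized form and a plain set of versioned names. classify_names, versioned and ignore_patterns are hypothetical names for illustration, not the actual WorkingTree code, and the ignore check is reduced to simple glob matching.

```python
import unicodedata
from fnmatch import fnmatch


def classify_names(disk_names, versioned, ignore_patterns=()):
    """Yield (name, status) pairs; status is 'versioned', 'ignored' or 'unknown'."""
    for name in disk_names:
        # Fast path: the on-disk name matches the inventory exactly,
        # so no normalization work is done at all.
        if name in versioned:
            yield name, 'versioned'
            continue
        # Slow path: only names that missed the inventory get normalized.
        # NFC is an assumption made for this sketch.
        normalized = unicodedata.normalize('NFC', name)
        if normalized != name and normalized in versioned:
            yield normalized, 'versioned'
            continue
        # Not versioned under either spelling: fall through to ignore rules.
        if any(fnmatch(normalized, pattern) for pattern in ignore_patterns):
            yield normalized, 'ignored'
        else:
            yield normalized, 'unknown'
```

Note that the sketch reports the normalized name rather than the on-disk spelling, which is what not being normalization-preserving amounts to here. The extra cost for a tree of mostly versioned, plain-ASCII names is the one set lookup on the fast path.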
I think there might be something we can do. I'll probably work on it a
little this week, and if I get somewhere, we hope to get it into 0.9.
John
=:->