Unicode Normalization
Robey Pointer
robey at lag.net
Thu Jun 29 07:31:10 BST 2006
On 26 Jun 2006, at 7:32, John Arbash Meinel wrote:
> Right now, I think the best way to go would be to do something in
> list_files, similar to how WorkingTree does it now for ignored files.
>
> Basically, you go through, and if you know a file is versioned, you
> just return it. If it doesn't match the inventory, you check if it
> needs to be normalized. And if the name changes, you then check again
> if it is versioned, and then go on to check if it is ignored, etc.
>
> Does this seem reasonable? It adds an extra function call, and an if
> statement to the list_files loop. Which I'm not super keen on (since
> it affects initial 'add' performance).
> But I think it has the least impact in the case that most of the files
> are versioned, and most of them are not fancy unicode, while still
> correctly handling filenames on all platforms.
I'm basically +1 on this approach.
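
To make sure I'm reading the proposal right, here is a rough sketch of that loop in Python. It is not the actual list_files code: classify_names, the versioned set and the is_ignored callback are names I made up for illustration, and using NFC as the normalization form is my assumption, since the form hasn't been settled in this thread.

```python
import unicodedata


def classify_names(names, versioned, is_ignored=lambda name: False):
    """Yield (name, status) pairs for on-disk names.

    Hypothetical helper, not bzr's API. NFC is an assumed choice of
    normalization form.
    """
    for name in names:
        # Fast path: most files are versioned and need no extra work.
        if name in versioned:
            yield name, "versioned"
            continue
        # Only unmatched names pay for the normalization call.
        normalized = unicodedata.normalize("NFC", name)
        if normalized != name and normalized in versioned:
            yield normalized, "versioned"
        elif is_ignored(normalized):
            yield normalized, "ignored"
        else:
            yield normalized, "unknown"
```

The point of the ordering is that a name already in the inventory never reaches the normalize call, so the common case costs one extra membership test.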
Would this be a good way to handle case normalization too? On Mac
and Windows, "README" and "ReadMe" are the same file: case is
preserved but not significant. This has actually caused me a problem
once or twice with files in other VCSes. It'd be nice if bzr went "I
don't know about ReadMe, but README is versioned and you're on a Mac,
so they're the same file."
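
Something like the following is what I have in mind for the case check. Again this is only a sketch: find_versioned and the case_insensitive flag are hypothetical, and str.casefold() is just a stand-in for whatever comparison the filesystem really uses, which may differ in corner cases.

```python
def find_versioned(name, versioned, case_insensitive):
    """Return the versioned name matching `name`, or None.

    Hypothetical helper. `case_insensitive` would be set by the caller
    on platforms where case is preserved but not significant.
    """
    # An exact match wins and costs nothing extra.
    if name in versioned:
        return name
    if case_insensitive:
        folded = name.casefold()
        # Linear scan for clarity; a real version would likely keep a
        # precomputed map from folded name to versioned name.
        for candidate in versioned:
            if candidate.casefold() == folded:
                return candidate
    return None
```

With that, find_versioned("ReadMe", {"README"}, True) gives back "README", while the same call with case_insensitive=False gives None.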
robey