[RFC] case sensitivity on Windows

Mark Hammond mhammond at skippinet.com.au
Thu Oct 30 00:01:04 GMT 2008


Sorry for the delay - I've been playing with this and making some progress, and it gets tricky fairly quickly :)  I thought I'd share my progress and ask a couple of questions:

> So for something like "add" I would like a WorkingTree api for
> canonicalizing the path that we are given. 'smart_add' has a nice
> location for doing this when we:
> # validate user file paths and convert all paths to tree ...

I've come up with a couple of new functions - osutils.canonical_relpath() and WorkingTree.canonical_relpath().  They work exactly the same as relpath, except that any parts of the returned relative path which already exist are normalized to the case they have on the file-system.  As the attached patch mentions in an XXX comment, this could be optimized by using win32api.FindFiles, but it's probably better to have unoptimized-but-correct functionality than undesirable functionality (specifically, it wasn't clear how to hook an 'optional' win32api function via win32utils for this purpose, and I didn't want to get bogged down in details).  As John mentioned, smart_add has a nice place for this, and for many other functions safe_relpath_files() in builtins.py is convenient (but not entirely correct).
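
To make the idea concrete, here is a rough standalone sketch of the unoptimized approach: walk the relative path one component at a time and look each one up in a directory listing.  This is only an illustration and not the code from the attached patch; the signature, the ValueError and the use of os.listdir are assumptions made for the example.

```python
import os


def canonical_relpath(base, path):
    """Return `path` relative to `base`, using on-disk case where it exists.

    Illustrative sketch only: the signature and error handling are
    assumptions, not the implementation in the attached patch.
    """
    base = os.path.abspath(base)
    rel = os.path.relpath(os.path.abspath(path), base)
    if rel == os.curdir:
        return ''
    parts = rel.split(os.sep)
    if parts[0] == os.pardir:
        raise ValueError('%r is not inside %r' % (path, base))

    current = base
    result = []
    for i, part in enumerate(parts):
        try:
            names = os.listdir(current)
        except OSError:
            # Not a directory (or unreadable): nothing left to match against.
            names = []
        if part in names:
            # An exact match always wins.
            match = part
        else:
            # Otherwise fall back to the first case-insensitive match.
            lower = part.lower()
            match = next((n for n in names if n.lower() == lower), None)
        if match is None:
            # This component does not exist yet, so neither can anything
            # below it: keep the rest exactly as the user typed it.
            result.extend(parts[i:])
            break
        result.append(match)
        current = os.path.join(current, match)
    return '/'.join(result)
```

With a tree containing Dir/File.TXT, asking for dir/file.txt would come back as Dir/File.TXT, while dir/new.txt would come back as Dir/new.txt.  One listdir per component is the cost that something like win32api.FindFiles would avoid.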

It addresses all the use-cases in http://bazaar-vcs.org/CasePreservingWorkingTreeUseCases (and adds blackbox tests for them) - except the last - handling renames... 

> As for the rest of the time...
> 
> 'iter_changes' would need a way to track which files 'missed' on the
> first pass, and then come back and correct it. We have a slight
> advantage that we generally work in a directory-at-a-time, though there
> are still problems. I believe the output order is defined as
> sorted-per-directory, so you can't yield anything once you have a miss,
> until you get to the end and resolve whether it was a *real* miss or
> not.

This is to handle the last 2 paragraphs in http://bazaar-vcs.org/CasePreservingWorkingTreeUseCases?  IOW, handling foo 'unintentionally' being renamed to Foo (ie, your example below, but assuming the mv was 'accidental' - eg due to the user specifying an 'incorrect' case to some other tool)?
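
For reference, my reading of the "track the misses and resolve them at the end of the directory" idea is roughly the following.  It is a toy model over plain lists of names; the function name and its inputs are made up for illustration and nothing here is the real iter_changes code.

```python
def pair_directory_entries(inventory_names, disk_names):
    """Pair the inventory names of one directory with on-disk names.

    Hypothetical helper for illustration.  Returns a list of
    (inventory_name, disk_name_or_None) sorted by inventory name.
    """
    disk = set(disk_names)
    pairs = {}
    misses = []
    # First pass: exact matches only; remember everything that missed.
    for name in inventory_names:
        if name in disk:
            pairs[name] = name
            disk.discard(name)
        else:
            misses.append(name)
    # Second pass: only now, with the whole directory seen, can we tell
    # whether a miss was real or just a difference in case.
    leftovers = {}
    for name in sorted(disk):
        leftovers.setdefault(name.lower(), name)
    for name in misses:
        pairs[name] = leftovers.pop(name.lower(), None)
    # Nothing can be emitted before this point without breaking the
    # sorted-per-directory ordering.
    return sorted(pairs.items())
```

Here 'foo' in the inventory would pair with 'Foo' on disk, while an entry with no match under any case would pair with None, i.e. a real miss.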
 
> touch foo
> bzr add foo
> mv foo Foo
> bzr add Foo
> 
> doesn't try to add Foo twice even though it has a different name now.

[I'm actually surprised workingtree's 'case_sensitive' attribute doesn't prevent this already - it may be that attribute is only used during merges - but that's a bit of a digression for now]

Right.  This is quite possible in the 'accidental' rename case when 'bzr add' wasn't given any args.  I'm starting to run into issues where I'm not clear on the desired semantics in this situation, and it would help greatly to clarify them.

Off the top of my head:

(the above) 
% touch foo
% bzr add foo
% mv foo Foo
% bzr add Foo
=> '' (ie, bzr should ignore the filesystem case for an existing entry)
 
(the above, but the last 'add' is implicit)
% touch foo
% bzr add foo
% mv foo Foo
% bzr add
=> '' (ie, bzr should ignore the filesystem case for an existing entry)
% bzr diff/status
=> '' (ie, nothing has changed)

Similarly if the *contents* of 'foo' changes along with the case:
% touch foo
% bzr add foo
% mv foo Foo
% echo hello > Foo
% bzr diff/status
=> 'reports that "foo" has changed'

But the user can explicitly rename either via bzr:
% touch foo
% bzr add foo
% bzr mv foo Foo
=> 'renamed foo -> Foo' - and as usual, bzr will have done a rename on the file-system.

Or manually before telling bzr:
% touch foo
% bzr add foo
% mv foo Foo
% bzr mv foo Foo
=> 'renamed foo -> Foo' (bzr doesn't need to touch the file-system)

But to be consistent with the other scenarios where the file-system case doesn't match the command-line:
 
% touch foo
% bzr add foo
% mv foo FOO
% bzr mv foo Foo
=> 'renamed foo -> FOO' (ie, bzr used the FOO on the file-system in preference to Foo on the cmdline).
Indeed, in the above example 'bzr mv foo foo' would also be interpreted as 'mv foo FOO'

But we could (it's not clear if we *should*) insist the case for the *source* of the move exactly matches the inventory:
% touch foo
% bzr add foo
% mv foo FOO
% bzr mv FOO Foo
=> 'ERROR: no entry FOO' (this possibly works already)
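
The 'add' and 'mv' scenarios above can be written down as a small executable model, which may make them easier to argue about.  This is purely a sketch of the proposed semantics over sets of names; find_entry, add and mv are hypothetical helpers and not bzr APIs, and the strict source check reflects the "could" option in the last scenario.

```python
def find_entry(names, name):
    """Return the member of `names` equal to `name` ignoring case, or None."""
    lower = name.lower()
    for candidate in names:
        if candidate.lower() == lower:
            return candidate
    return None


def add(inventory, disk_name):
    """Model of 'bzr add': an existing entry wins over the file-system case."""
    if find_entry(inventory, disk_name) is not None:
        return ''  # already versioned, possibly under a different case
    inventory.add(disk_name)
    return 'added %s' % disk_name


def mv(inventory, disk, source, dest):
    """Model of 'bzr mv' with a strict, case-exact source."""
    if source not in inventory:
        return 'ERROR: no entry %s' % source
    if source in disk:
        # The file still has its inventory name, so the rename is done
        # here and the case given on the command line is used.
        disk.remove(source)
        disk.add(dest)
        target = dest
    else:
        # The file was already renamed by hand: prefer the case found on
        # the file-system over the case given on the command line.
        on_disk = find_entry(disk, dest)
        target = on_disk if on_disk is not None else dest
    inventory.remove(source)
    inventory.add(target)
    return 'renamed %s -> %s' % (source, target)
```

For example, with inventory {'foo'} and 'FOO' on disk, mv(inventory, disk, 'foo', 'Foo') reports 'renamed foo -> FOO', and mv(inventory, disk, 'FOO', 'Foo') reports 'ERROR: no entry FOO'.  The diff/status scenarios are not modelled here.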

So I guess I'm asking for 3 things (all of which can be done independently :)

* Comments and general review of the patch as it stands and with the limitations described above.

* Agreement/modifications to the scenarios above.

* Suggestions for moving forward.  Publish the branch at launchpad and hope to get some collaborators?

I hope this all makes sense after the 50th attempt at writing it :)

Thanks,

Mark.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cicp_filesystem.patch
Type: application/octet-stream
Size: 18625 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20081030/50dbd3ed/attachment-0001.obj 

