the inventory must be updated as merge proceeds, not at the end
Denys Duchier
duchier at ps.uni-sb.de
Wed Dec 28 15:52:03 GMT 2005
As I was looking into fusing my computation of the resulting inventory
directly from the merge changeset with the algorithm of changeset
application, I finally had to take a look at conflict handling.
Perhaps this is not news to those in the know, but conflict handling
during merge suffers from a conceptual flaw. Let me explain: merge
essentially happens in 2 phases:
1. delete / move from src to tmp
2. create / move from tmp to dst
Name conflicts can be discovered during phase 2. The merge conflict
handler attempts to resolve a name conflict by moving the existing dst
file away (adding as many ".moved" suffixes as necessary) to make room for the
new file with the same name. If the file that is being moved away is
versioned, the handler attempts to fix the inventory to reflect the
name change (otherwise, we'd lose track of where things are).
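
For readers unfamiliar with the ".moved" convention, here is a minimal sketch of the filesystem half of that step. It is not the bzrlib code: the function names moved_name and move_aside are hypothetical, and the sketch deliberately leaves out the inventory update, which is the part at issue in this message.

```python
import os


def moved_name(path):
    """Return `path` plus as many ".moved" suffixes as needed to be unused."""
    candidate = path + ".moved"
    while os.path.lexists(candidate):
        candidate += ".moved"
    return candidate


def move_aside(path):
    """Rename an existing file out of the way and return its new path.

    Only the filesystem is touched here; nothing records the rename
    in any inventory.
    """
    new_path = moved_name(path)
    os.rename(path, new_path)
    return new_path
```

Calling move_aside on a path whose ".moved" sibling already exists yields a name ending in ".moved.moved", and so on.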
Unfortunately, none of the filesystem surgery performed during merge
is, at that point, reflected in the inventory. So, when the conflict
handler attempts to fix the inventory, the latter may no longer be in
synch with the tree.
As a consequence of the desynchronization between inventory and
filesystem, the following 2 kinds of incoherent situations may arise:
1. name conflicts in the filesystem not attested in the inventory
2. name conflicts in the inventory not attested in the filesystem
I believe I have devised examples of both kinds. Btw, before I start
with the examples, I'd like to say that I find it a little annoying
that the following should fail:
$ bzr init A
$ bzr branch A B
bzr: ERROR: exceptions.AssertionError: <type 'NoneType'>
at /mnt/kubuntu/home/duchier/src/bzr.dev/bzrlib/branch.py line 928
in get_inventory_xml
1. NAME CONFLICT IN THE FILESYSTEM NOT ATTESTED IN THE INVENTORY
$ bzr init A
$ cd A
$ mkdir a
$ mkdir b
$ bzr add a b
added a
added b
$ bzr commit -m "added a/ and b/"
Committed revision 1.
$ cd ..
$ bzr branch A B
preparing to copy: .
copy-to: ..
copy: .
Branched 1 revision(s).
$ cd A
$ touch a/file
$ bzr add a/file
added a/file
$ bzr commit -m "added a/file"
Committed revision 2.
$ cd ../B
$ bzr mv a b/a
a => b/a
$ touch b/a/file
$ bzr add b/a/file
added b/a/file
$ bzr commit -m "moved a/ to b/a/, added b/a/file"
Committed revision 2.
$ cd ../A
$ bzr merge ../B
get source ancestry: .
get destination ancestry: .
copy revision: .
bzr: WARNING: Moved existing /mnt/kubuntu/home/duchier/src/tmp/A/./b/a/file to /mnt/kubuntu/home/duchier/src/tmp/A/./b/a/file.moved
bzr: ERROR: b/a/file is already versioned
2. NAME CONFLICT IN THE INVENTORY NOT ATTESTED IN THE FILESYSTEM
$ bzr init A
$ cd A
$ touch bogus
$ bzr add bogus
added bogus
$ bzr commit -m "cannot branch empty tree"
Committed revision 1.
$ cd ..
$ bzr branch A B
preparing to copy: .
copy-to: ..
copy: .
Branched 1 revision(s).
$ cd A
$ touch file
$ bzr add file
added file
$ bzr commit -m "added file"
Committed revision 2.
$ cd ../B
$ touch file
$ bzr add file
added file
$ bzr commit -m "added (other) file"
Committed revision 2.
$ cd ../A
$ rm file
$ bzr merge ../B
get source ancestry: .
get destination ancestry: .
copy revision: .
bzr: ERROR: file is already versioned
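
The two transcripts can be read against a toy model in which the filesystem and the inventory are just two sets of paths. This is only an illustration, not bzrlib code: classify_name is a hypothetical helper, and the sets are an assumed reading of the state at the moment the name conflict is discovered in each example.

```python
def classify_name(name, fs_names, inv_names):
    """Say where `name` is already taken: on disk, in the inventory, or both."""
    on_disk = name in fs_names
    versioned = name in inv_names
    if on_disk and versioned:
        return "conflict in both"
    if on_disk:
        return "filesystem only"
    if versioned:
        return "inventory only"
    return "free"


# Example 1 (assumed state): the merge has already moved a/ to b/a/ on
# disk, but the inventory still records the old paths.
fs_1 = {"b", "b/a", "b/a/file"}
inv_1 = {"a", "a/file", "b"}
kind_1 = classify_name("b/a/file", fs_1, inv_1)

# Example 2 (assumed state): "file" was removed from disk with rm, but
# the inventory still lists it.
fs_2 = {"bogus"}
inv_2 = {"bogus", "file"}
kind_2 = classify_name("file", fs_2, inv_2)
```

Under these assumptions kind_1 is a conflict visible only in the filesystem and kind_2 one visible only in the inventory, matching the two kinds of incoherence listed earlier.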
My conclusion is that the surgical operations performed on the tree
during merge ought to be made into "tree" operations, i.e. operations that affect
both the filesystem and the inventory so that both remain accurately
in synch at all times. To this end, it becomes necessary to give
official standing to the idea of "moving things into and out of temp".
The idea that I started to explore is to extend the workingtree
abstraction with a notion of limbo mediated by the following two
operations:
    limbo_put(self, file_id)
        excise the subtree rooted at the entry identified by file_id
        and stash it away in limbo
    limbo_get(self, file_id, parent_id, name, conflict_handler)
        take the subtree stashed away in limbo and identified by
        file_id, and splice it into the working tree under the entry
        identified by parent_id, and the given name
In my prototype, the tmp directory associated with limbo resides in
.bzr/limbo and I use file_ids as filenames in that directory.
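
A rough sketch of how the two operations might keep disk and inventory in step is given below. It is not the prototype itself, which is not shown in this message. ToyTree, ROOT_ID, path_of, add, and the dictionary-based inventory are all assumptions made for illustration; only the limbo_put / limbo_get signatures and the .bzr/limbo location come from the description above. The conflict_handler is assumed to be a callable taking the tree and the occupied path, and to free that path before returning.

```python
import os

ROOT_ID = "TREE_ROOT"  # hypothetical id standing for the tree root


class ToyTree:
    """Toy working tree: every move updates disk and inventory together."""

    def __init__(self, basedir):
        self.basedir = basedir
        self.limbo_dir = os.path.join(basedir, ".bzr", "limbo")
        os.makedirs(self.limbo_dir, exist_ok=True)
        self.inventory = {}    # file_id -> (parent_id, name)
        self.in_limbo = set()  # file_ids currently stashed in limbo

    def add(self, file_id, parent_id, name):
        """Record an entry that already exists on disk."""
        self.inventory[file_id] = (parent_id, name)

    def path_of(self, file_id):
        """Resolve a file_id to its current location on disk."""
        if file_id == ROOT_ID:
            return self.basedir
        if file_id in self.in_limbo:
            # Entries in limbo are stored under their file_id.
            return os.path.join(self.limbo_dir, file_id)
        parent_id, name = self.inventory[file_id]
        return os.path.join(self.path_of(parent_id), name)

    def limbo_put(self, file_id):
        """Excise the subtree rooted at file_id and stash it in limbo."""
        src = self.path_of(file_id)
        self.in_limbo.add(file_id)
        self.inventory.pop(file_id)
        # Children keep their parent_id, so they follow the subtree.
        os.rename(src, self.path_of(file_id))

    def limbo_get(self, file_id, parent_id, name, conflict_handler):
        """Splice a stashed subtree back in under parent_id as `name`."""
        dst = os.path.join(self.path_of(parent_id), name)
        if os.path.lexists(dst):
            conflict_handler(self, dst)
            if os.path.lexists(dst):
                raise OSError("conflict handler did not free " + dst)
        os.rename(self.path_of(file_id), dst)
        self.in_limbo.discard(file_id)
        self.inventory[file_id] = (parent_id, name)
```

With this shape, moving a/ to b/a/ becomes limbo_put on the id of a/ followed by limbo_get with the id of b/ as parent, and path_of gives the right answer for a/file at every intermediate point, including while it sits in limbo.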
Comments?
Cheers,
--Denys