Semantics for file copy

Wed Aug 20 19:45:51 BST 2008

Hi!

I read http://bazaar-vcs.org/BzrFileCopies and the question posed there 
about the following scenario:

$ bzr branch foo bar
$ cd foo
foo$ bzr copy a b
foo$ echo a >> a
foo$ echo b >> b
foo$ bzr commit -m "Cloned a to b"
foo$ cd ../bar
bar$ bzr mv a c
bar$ echo c >> c
bar$ bzr commit -m "Renamed a to c"
bar$ bzr merge ../foo

For the content:

I think I agree with the attempted answer by PierreAntoineChampin: 
all changes applied to a common ancestor in one tree should, upon merge, 
be applied to all files derived from that ancestor in the other tree. So 
both files should end up in conflict, one containing the appended lines 
a and c and the other containing b and c.

More formally: for every pair of files a and b with a common ancestor c, 
after the merge a should contain all changes applied to b since c, and b 
should contain all changes applied to a since c. This sounds a lot like 
a normal content merge, but because every file can be part of more than 
one pair, a single merge may have to bring in changes from more than one 
line of modifications.
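The pairing rule can be sketched in a few lines of Python. This is only an illustration, not anything bzr does: the function name merge_pairs, the dictionaries and the identifier "id-a" are made up, and each tree is assumed to record, for every path, the identity of the file it was derived from.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def merge_pairs(this_tree: Dict[str, str],
                other_tree: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (this_path, other_path) pairs that share a common ancestor.

    Each tree maps a current path to a hypothetical ancestor identifier.
    Every returned pair stands for one content merge to perform.
    """
    # Index the other tree by ancestor so lookups are cheap.
    by_ancestor = defaultdict(list)
    for path, ancestor in other_tree.items():
        by_ancestor[ancestor].append(path)

    pairs = []
    for path, ancestor in sorted(this_tree.items()):
        for other_path in sorted(by_ancestor.get(ancestor, [])):
            pairs.append((path, other_path))
    return pairs


# The scenario from the example: bar renamed a to c, foo copied a to b.
bar = {"c": "id-a"}
foo = {"a": "id-a", "b": "id-a"}
print(merge_pairs(bar, foo))  # [('c', 'a'), ('c', 'b')]
```

Here c takes part in two pairs, which is exactly the situation where one merge has to combine more than one line of modifications.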

A special case might be useful during refactoring. When I copy a file, I 
usually remove about half the contents from each copy afterwards, as 
splitting the file is what I originally had in mind. In that case, each 
modification to the common ancestor (in branch bar) would be surrounded 
by blocks of unchanged code, and each such block would be present almost 
unchanged in one of the copies and completely removed in the other copy.

To handle this case more intelligently, each modification to the common 
ancestor should be considered independently. If it applies cleanly to 
one copy, and the surrounding block (say at least two lines on either 
side) is known to have been completely removed from the other copy, then 
the modification should be dropped from the second copy without causing 
a conflict.
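A minimal Python sketch of that decision, under simplifying assumptions: a modification is a hunk with explicit context lines, "applies cleanly" means the context and old lines occur contiguously in the copy, and "completely removed" means none of those lines occur in the copy at all. The names Hunk, find_block and classify are invented for this illustration. A real implementation would need something more robust than plain line membership, since blank lines or a lone "}" would count as present.

```python
from typing import List, NamedTuple


class Hunk(NamedTuple):
    before: List[str]  # context lines preceding the change
    old: List[str]     # lines replaced in the common ancestor
    new: List[str]     # replacement lines
    after: List[str]   # context lines following the change


def find_block(lines: List[str], block: List[str]) -> int:
    """Return the index where block occurs contiguously in lines, or -1."""
    n = len(block)
    for i in range(len(lines) - n + 1):
        if lines[i:i + n] == block:
            return i
    return -1


def classify(hunk: Hunk, copy_lines: List[str]) -> str:
    """Decide what to do with one hunk for one copy of the file."""
    block = hunk.before + hunk.old + hunk.after
    if find_block(copy_lines, block) >= 0:
        return "apply"      # context intact: the hunk applies cleanly
    present = set(copy_lines)
    if not any(line in present for line in block):
        return "drop"       # whole surrounding block was removed
    return "conflict"       # partially present: leave it to the user


hunk = Hunk(["h1", "h2"], ["x"], ["X"], ["t1", "t2"])
first_half = ["h1", "h2", "x", "t1", "t2"]
second_half = ["u1", "u2"]
print(classify(hunk, first_half), classify(hunk, second_half))  # apply drop
```

The modification is dropped from a copy only when the other copy reports "apply"; that combination is left to the caller here.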

For the file names:

I would say that the resulting tree should contain the files c and b, 
and report a path conflict for b / c.

Consider the file name as an attribute associated with a file as an 
entity. For the file a, foo leaves the name unchanged while bar renames 
it to c. So there should be no conflict, and the resulting name should 
be c. For the file b, foo "changed" the name from a to b during the 
copy, whereas bar changed the name of the file to c. This is very 
similar to the kind of path conflicts already possible with current bzr 
if you replace the copy in the above example by a move.
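Treating the name as an attribute amounts to an ordinary three-way merge on a single value. The sketch below is only an illustration of that reasoning; merge_name is a made-up helper, not a bzr function.

```python
from typing import Optional


def merge_name(base: str, this: str, other: str) -> Optional[str]:
    """Three-way merge of a file name; None signals a path conflict."""
    if this == other:
        return this   # both sides agree
    if this == base:
        return other  # only the other side changed the name
    if other == base:
        return this   # only this side changed the name
    return None       # both sides changed it differently


# Merging foo into bar, so "this" is bar and "other" is foo.
print(merge_name("a", "c", "a"))  # file a: bar renamed, foo did not -> c
print(merge_name("a", "c", "b"))  # file b: a -> b in foo, a -> c in bar -> None
```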

Care would have to be taken with file names. Every file in conflict 
would need a unique name for as long as the conflict lasts. In the above 
example, substituting mv for copy would result in a file called "b" 
after the merge, which is just what the copy case needs, but I guess 
this wouldn't always play out so nicely. I doubt that there is an 
algorithm that could always select a feasible mix of file names from 
both branches, so bzr would have to be prepared to autogenerate file 
names when required.
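Autogenerating a name could be as simple as the following Python sketch. The ".conflict-N" suffix and the helper unique_name are assumptions made for this illustration and say nothing about the names bzr itself would choose.

```python
from typing import Set


def unique_name(name: str, taken: Set[str]) -> str:
    """Return name, or name with a numbered suffix if it is already taken."""
    if name not in taken:
        return name
    n = 1
    # Count upwards until an unused candidate is found.
    while f"{name}.conflict-{n}" in taken:
        n += 1
    return f"{name}.conflict-{n}"


taken = {"b", "c"}
print(unique_name("b", taken))  # b.conflict-1
```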

Greetings,
  Martin von Gagern
