Optimal Merge Base selection

Aaron Bentley aaron.bentley at utoronto.ca
Sun Jul 10 20:05:01 BST 2005


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John Arbash Meinel wrote:
> I've been thinking about how to select the merge base, and I think I'm
> realizing that it really doesn't matter all that much.

The older your base is, the greater the chance that both sides have
introduced changes since it.  Let's take a look at the history of a line:

revision A-1: AAA
revision A-2: BBB
revision A-3: CCC
revision B-3: BBB

The idea is that we have branch B being branched from A-2.  B-3 is a
revision that didn't change the line we're interested in.  (Maybe it
changed some other line).

If you select A-2 as a base, your standard three-way logic produces
CCC.  This is because wrt A-2, B-3 is unchanged, so A-3 is taken.

If you select A-1 as a base, your standard three-way logic will produce
a conflict, because wrt A-1, both B-3 and A-3 are changes.

Let's look at another line in the same file:
revision A-1: DDD
revision A-2: EEE
revision A-3: EEE
revision B-3: EEE

Since neither A-3 nor B-3 changed this line, you can select A-1 or A-2
as a base and you'll get EEE.  If you use A-1, it's the "identical
changes introduced in THIS and OTHER" case.  If you use A-2, it's the
"all texts are identical" case.

So yes, you don't *have* to select the perfect base.  But the better the
base, the fewer conflicts you will encounter.
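
The two line histories above can be checked with a small sketch of the
per-line three-way rule.  This is only an illustration, not the code in
merge3.Merge3() or diff3; the function name merge_line and the CONFLICT
marker are made up for the example, and I am assuming THIS is A-3 and
OTHER is B-3.

```python
CONFLICT = "<conflict>"  # hypothetical marker, not a bzr or baz constant


def merge_line(base, this, other):
    """Apply the standard three-way rule to a single line of text."""
    if this == other:
        # Either nothing changed, or both sides made the identical change.
        return this
    if this == base:
        # Only OTHER changed the line, so take OTHER.
        return other
    if other == base:
        # Only THIS changed the line, so keep THIS.
        return this
    # Both sides changed the line in different ways.
    return CONFLICT


# First line: A-1 = AAA, A-2 = BBB, A-3 = CCC, B-3 = BBB
print(merge_line("BBB", "CCC", "BBB"))  # base A-2: prints CCC
print(merge_line("AAA", "CCC", "BBB"))  # base A-1: prints <conflict>

# Second line: A-1 = DDD, A-2 = EEE, A-3 = EEE, B-3 = EEE
print(merge_line("DDD", "EEE", "EEE"))  # base A-1: prints EEE
print(merge_line("EEE", "EEE", "EEE"))  # base A-2: prints EEE
```

The older base only hurts on the first line, where A-3 changed the text
after the branch point.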

> Because if I
> cherrypick a change from someone else, that change will be in the text,
> and when I use diff3 or merge3.Merge3() if someone else did the
> cherrypick, then it will show up in both, and diff3 won't change
> anything. So with something like this layout:
> 
> Me:    A - B - C - D - E - F
> #      |             /   /
> Other: + - G - - - -    /
> #      |     \        /
> You:   + - H - I - - /
> 
> In this case, G is probably not as good of a merge base as A, but
> breadth-first-search would use G (to merge I into F).

This is actually one of the properties of Baz merge-- it can pick a
common base that's not in the revision history of either branch.

I'm not sure why you think it's not as good a base.  In my view, G is
the best-known base.
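
For reference, here is a rough sketch of the breadth-first selection
John mentions, run on the layout quoted above.  The parent map is my
reading of the ASCII diagram (E merged G, I merged G), and the function
names are invented for the example; it is not how baz or bzr actually
stores or walks ancestry.

```python
from collections import deque

# Assumed parent map for the quoted diagram; the first parent is the
# branch's own previous revision, later parents are merged revisions.
PARENTS = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["C"],
    "E": ["D", "G"], "F": ["E"],
    "G": ["A"], "H": ["A"], "I": ["H", "G"],
}


def ancestors(tip, parents):
    """Return tip plus every revision reachable through any parent."""
    seen = {tip}
    todo = deque([tip])
    while todo:
        for parent in parents[todo.popleft()]:
            if parent not in seen:
                seen.add(parent)
                todo.append(parent)
    return seen


def bfs_base(this, other, parents):
    """Walk back from `this` breadth-first and return the first revision
    that is also in the full ancestry of `other`, or None."""
    other_ancestry = ancestors(other, parents)
    seen = {this}
    todo = deque([this])
    while todo:
        rev = todo.popleft()
        if rev in other_ancestry:
            return rev
        for parent in parents[rev]:
            if parent not in seen:
                seen.add(parent)
                todo.append(parent)
    return None


print(bfs_base("F", "I", PARENTS))  # prints G
```

G is reached two steps back from F (through E), while A is five steps
back, so the breadth-first walk stops at G.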

> So I was thinking, what about using depth-first search along the
> revision-history of each branch, compared to the complete tree of the
> other branch.
> 
> So the history for "Me" is A-F, the history for "You" is A,H,I, which
> means that A is our common ancestor.

I think that would work, if I agreed that G was a bad base.
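
One possible reading of that proposal, as a sketch: walk each branch's
own history (first parents only) and stop at the first revision found
anywhere in the other branch's ancestry.  The parent map is again my
reading of the first diagram, the names are invented, and the rule for
choosing between the two directions is my own assumption, since the
proposal does not spell one out.

```python
# Assumed parent map for the first diagram; first parent = own history.
PARENTS = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["C"],
    "E": ["D", "G"], "F": ["E"],
    "G": ["A"], "H": ["A"], "I": ["H", "G"],
}


def ancestors(tip, parents):
    """Return tip plus every revision reachable through any parent."""
    seen, todo = set(), [tip]
    while todo:
        rev = todo.pop()
        if rev not in seen:
            seen.add(rev)
            todo.extend(parents[rev])
    return seen


def mainline(tip, parents):
    """Yield the branch's own history, following first parents only."""
    rev = tip
    while rev is not None:
        yield rev
        rev = parents[rev][0] if parents[rev] else None


def first_shared(tip, other_tip, parents):
    """First mainline revision of tip found in other_tip's full ancestry."""
    other_all = ancestors(other_tip, parents)
    return next((r for r in mainline(tip, parents) if r in other_all), None)


def mainline_base(this, other, parents):
    a = first_shared(this, other, parents)
    b = first_shared(other, this, parents)
    if a is None or b is None:
        return a or b
    # Assumption: if the two walks disagree, prefer the more recent one.
    return a if b in ancestors(a, parents) else b


print(mainline_base("F", "I", PARENTS))  # prints A
```

On this layout the walk never visits G, because G is on neither
mainline, so it falls back to A.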

> Another situation is something like this:
> 
> Me:    A - B - C - D - E - F
> #      |             /   /
> Other: + - G - - - -   /
> #          |         /
> You:       + - H - I
> 
> In this case, G probably is the optimal merge, but it is in the
> revision-history of "You".
> 
> This change would mean that trees with no real common ancestry, which
> only merged from a common group, would not find their commonality. I
> mean something like this:
> 
> Me:    A - B - C - D - E - F
> #            /           /
> Other: G - H           /
> #            \       /
> You:   J - K - L - M


I did not intend to allow this case.  My plan was that when you merged H
into C, H would not be considered an ancestor.

> In this case, it would not find H as a merge base, since it is not along
> the main line of development for either tree.
> Though how 'H' got merged in, considering it doesn't share history, is
> left as an exercise for the user. 

Not to mention the guy who's trying to implement this stuff :-)

> Say somehow Me and You both merged a
> library (Other) into our main program. And then You update the library,
> and I want to merge your changes. In arch this could be done with a "baz
> replay M", which would even ignore any changes in M that did not affect
> shared code. I don't really know what "baz merge" would do. And now that
> it is after 1am, I can't think what it should do.

To get the changes I introduced in M, you do baz merge M L.  However,
the current conflict handler throws an exception when you try to do a
merge and there's no file in THIS.  Suggestions for how to handle this
conflict are welcome.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFC0XFc0F+nu1YWqI0RAk88AKCCvkyJju7XRxhNeDnqaJi7Cgj9dwCfarJ0
ckHklm42xs9f/LGFz2VgWTU=
=uotn
-----END PGP SIGNATURE-----



