More merge base discussion [was: monotone's LCA+DOM algo for selecting a merge base]
Denys Duchier
duchier at ps.uni-sb.de
Tue Feb 21 20:55:56 GMT 2006
Today, while cleaning my apartment, I had some time to think about "merge base
selection", and I may have had a very moderate epiphany. I apologize in advance
if this was already clear to others (also, I haven't quite caught up with the
entire thread yet).
First, because it will be important for the argument that I develop below, I
must reiterate my view that the job of a merge point is solely to choose which
halfs of the hunk conflict pairs to keep. This is (I think) in contrast to the
view advanced by Aaron that merges are machine assisted edits of changes. If
you prefer, think of bzr merges as actually consisting of possibly two
successive revisions: the first one only chooses a hunk from every conflict
pair, the optional second one makes additional edits.
Second, I firmly adopt the point of view that merges are non-symmetric: you
merge OTHER into THIS. I believe this may be in disagrement with the idea
advanced by JAM that base selection should be symmetric.
As a consequence of this distinction, the revision graph contains two kinds of
arcs: "derivation" arcs (i.e. from left parent) and "merge" arcs (i.e. from
non-left parent: meaning right parent if there are two of them).
DEFINITIONS:
A "derivation" ancestor of R is an ancestor of R that is on R's main history
line.
A "merge" ancestor of R is an ancestor of R that is not on R's main history
line.
R0 is a merge ancestor of R if it is an ancestor of R such that all paths from
R0 to R traverse at least one merge arc. R0 is a derivation ancestor of R if
there exists at least one path from R0 to R that traverses only derivation arcs.
NO MISTAKEN UPDATES:
An essential condition, when picking a merge base (BASE), is that it should not
mistakenly overwrite a deliberate choice in THIS's history over OTHER's.
Consider the following example where derivation arcs are drawn solid and merge
arcs are drawn with asterisks:
A[123]
/ \
/ \
B[143] C[153]
| *|
| * |
| * |
| * |
D[143] |
| * |
| * |
| * |
| *|
| F[153]
| |
| |
| |
E[1430] G[1530]
THIS OTHER
Picking D as BASE would cause "4" to appear unchanged in THIS and (incorrectly)
changed in OTHER. Picking C would make it appear changed in THIS and unchanged
in OTHER (correct, no conflict to resolve). Picking A would make it appear
changed in both (correct, but conflict to resolve).
In general picking a BASE that is a merge ancestor of OTHER can lead to a
harmful false positive (here, I need to develop a full proof): i.e. it may
classify the hunk as unchanged in THIS (with respect to BASE) and changed in
OTHER, while actually, with respect to OTHER's history, the hunk is unchanged in
OTHER and changed in THIS. A false positive in the other direction (or both
directions) is harmless.
This means that BASE must be a derivation ancestor of OTHER that is also an
ancestor of THIS (note: this is true of a dominator).
CHOOSING THE BASE:
This second part of my argument needs even more work, but it goes like this. We
need to consider the origin of the hunk data: the BASE must be chosen so as to
be comparable with the origin of any potential hunk (conflict) data. The set of
possible origins are the common ancestor revisions of THIS and OTHER. Thus, we
want to choose a BASE that is a derivation ancestor of OTHER and on all paths
from the root to any common ancestor revision (a dominator satisfies that
condition). We also want to pick BASE as close to THIS and OTHER as possible
(to minimize spurious conflicts). Thus picking the closest dominator of OTHER
that is also an ancestor of THIS will do the trick.
What to you think? does this sound like I am on the right track?
Cheers,
--Denys
More information about the bazaar
mailing list