Optimal Merge Base selection

Aaron Bentley abentley at panoramicfeedback.com
Mon Jul 11 14:58:15 BST 2005

John A Meinel wrote:
> Martin Pool wrote:

>>This case works best when there is no overlap in the file names.  (It
>>might be nice if there was a way to merge the root of OTHER into a
>>subdirectory of THIS.)
> Couldn't this be handled with my TREE_ROOT gets a real id suggestion?

Yes, but that breaks the case where you actually want them in the same
directory.  Once we have file-id aliases, that would make sense.

> It actually puts the patches into a directory named something like
> "+patches-missing-files.23123"
> So it sort-of ignores them. I think it should print a warning, and put
> the patch somewhere, but not actually stop the merge/replay.

I think tla is somewhat broken here-- they should be treated as
conflicts, but they don't produce that status code and they don't halt

>>In arch, baz and bzr this will mean we record that replayed version as
>>merged, but in fact not all of the changes have been taken in.  That
>>might cause trouble later.
> How so? How is that different from doing a merge, and then editing out
> the parts that you don't like?

One problem is you might select the just-committed revision as a base,
and therefore not be able to do proper three-way-merging on the missing

> I realize that bzr makes a different definition of what merging a
> revision means, but it seems like at some point merges need to be edited
> to get them to fit into the local tree. So having 'missing' is not
> really different from having 'edited'.

Just in scale.  No one ever recommends making massive changes to the
tree before committing a merge, just whatever's necessary to make the
merge right.

> Now if bzr incorporates weave/codeville merging, then you need to be a
> little bit more careful about cherry-picking. Because you start
> detecting that a future diff over-rules the previous diff, so you need
> to make sure that you know where the original came from, to see if any
> new ones over-rule it.
> So a different track... How does Codeville handle branching ancestry? If
> you annotate each line with a number indicating which revision it was
> modified, you can easily see that 10 > 9 thus 10 should take priority.
> But with a truly distributed setup, don't you have the problem that it
> isn't obvious whether john at arbash-meinel.com-200513123123-aontehuntaho
> comes before or after mdp at sourcefrog.net-20051423423-aoehunnth?

Yeah, in distributed RCSes, you can't trust time.  All you can trust is
sequence.  So if A is descended from B, you know their sequence.  But
for two parallel branches, you can't know the sequence.
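
To make the "trust sequence, not time" point concrete, here is a minimal sketch. It assumes the ancestry is available as a plain dict mapping each revision id to a list of its parent ids; that layout and the names `ancestors` and `compare` are made up for illustration and are not bzr's API.

```python
def ancestors(graph, rev):
    """Return every revision reachable from rev through parent links, rev included."""
    seen = set()
    stack = [rev]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(graph.get(current, ()))
    return seen


def compare(graph, x, y):
    """Order two revisions using ancestry only, never timestamps."""
    if x == y:
        return "same"
    if y in ancestors(graph, x):
        return "after"      # x is descended from y
    if x in ancestors(graph, y):
        return "before"     # y is descended from x
    return "parallel"       # neither descends from the other: no known sequence


# Hypothetical history: two branches off a common base, later merged.
graph = {
    "base": [],
    "john": ["base"],
    "mdp": ["base"],
    "merge": ["john", "mdp"],
}
print(compare(graph, "john", "mdp"))
print(compare(graph, "merge", "john"))
```

The "parallel" answer is the case described above: the two revision ids cannot be ordered, whatever dates they carry.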

> Or is it that when you merge my changes, they get re-labeled with the
> revision number where they were merged, and you just don't worry that
> they came from me.

A good annotation algorithm would handle it thusly:
For revision C with parents A and B, see if the change was introduced
from A to C, or from B to C.

Compare with A | Compare with B | Really introduced in
C              | C              | C
A              | C              | A
C              | B              | B
A              | B              | A & B

As you can see above, it's possible for A and B to both introduce the
same change.  For instance, they may both have applied the same patch.
I'm not sure what the right way to handle that for merging purposes is.
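
The table can be sketched in code roughly as follows. This is only one step of an annotation, under the assumption that "compare with A" means a plain line diff of C against A (done here with Python's difflib); the function names are made up for illustration. A line attributed to A or B would still need to be traced further back through that parent's own history.

```python
import difflib


def unchanged_indices(parent, child):
    """Indices of lines in child that the diff matches to a line in parent."""
    matcher = difflib.SequenceMatcher(None, parent, child, autojunk=False)
    kept = set()
    for block in matcher.get_matching_blocks():
        kept.update(range(block.b, block.b + block.size))
    return kept


def introduced_in(a_lines, b_lines, c_lines):
    """For each line of C, report where it came from per the table above."""
    from_a = unchanged_indices(a_lines, c_lines)
    from_b = unchanged_indices(b_lines, c_lines)
    result = []
    for i in range(len(c_lines)):
        if i in from_a and i in from_b:
            result.append("A & B")   # both parents already carry the line
        elif i in from_a:
            result.append("A")       # new relative to B only
        elif i in from_b:
            result.append("B")       # new relative to A only
        else:
            result.append("C")       # new relative to both parents
    return result


# Hypothetical example: both parents applied the same "patch" line.
a = ["one", "two", "patch"]
b = ["one", "patch", "three"]
c = ["one", "two", "patch", "three", "new"]
for line, origin in zip(c, introduced_in(a, b, c)):
    print(origin, "|", line)
```

Note that "A & B" also covers lines both parents simply inherited from a common ancestor, so this step alone cannot tell an inherited line from a patch applied twice.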

> I might be way too tangential and lost in my own thoughts. I can't say
> that I have spent a long time thinking about Codeville merging, other
> than the cursory, "looks kind of interesting".

Well, Codeville does a couple of neat things:
1. establishes the identity of each line in the file, so that you don't
need context to get merging right.
2. uses ancestry instead of a 'base' revision in order to determine
which changes supersede one another.

But of course, you can still get conflicts, and it's just a text-based
merge.  Also, I doubt it handles any unit finer than 'per-line'.  Though
I suppose you could use any separator you liked, e.g. whitespace, to do
annotate on finer levels.

One thing I think no one's mentioned about Codeville merge is, I don't
think you need the entire ancestry.  I think you only need the
annotation up until the last common ancestor.  (The rest can just be set

If we only have to annotate a small number of revisions, we may not need
a weave-based format to do it speedily.  It also means we can annotate
with different parameters (e.g. 'ignore line-ending differences', 'break
at all whitespace and the following characters '().'), and that we can
accept relatively wasteful annotation representations, since they're
only temporary.
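
As a rough sketch of what "different parameters" could mean, the unit being annotated can be swapped out before any diffing happens. The mode names and the `tokenize` function below are assumptions for illustration, not anything bzr or Codeville provides.

```python
import re


def tokenize(text, mode="line"):
    """Split text into the units that an annotation would track."""
    if mode == "line":
        # Ordinary per-line units, line endings kept as part of each unit.
        return text.splitlines(keepends=True)
    if mode == "line-ignore-eol":
        # Line endings dropped, so CRLF vs LF never shows up as a change.
        return text.splitlines()
    if mode == "word":
        # Break at all whitespace and at the characters '().'.
        return re.findall(r"\s+|[().]|[^\s().]+", text)
    raise ValueError("unknown mode: %r" % (mode,))


print(tokenize("a\r\nb\n", "line-ignore-eol"))
print(tokenize("foo(bar).baz x", "word"))
```

Since the annotation is temporary, the token lists can be rebuilt for each merge with whichever mode suits the files involved.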

Aaron Bentley
