More merge base discussion [was: monotone's LCA+DOM algo for selecting a merge base]

John A Meinel john at arbash-meinel.com
Mon Feb 20 18:10:33 GMT 2006


Aaron Bentley wrote:
> John A Meinel wrote:
>>> Isn't it true that, any time you get this pattern:
>>> http://aaronbentley.com/pastegraph.cgi?n=140
>>>
>>> Where we both merge a common revision, then LCA+DOM will pick the
>>> ultimate parent (A).
>>> When you probably want it to pick (J).
> 
> Yes.  And worse, a la http://aaronbentley.com/pastegraph.cgi?n=136
> 
>>> I don't know if it is quite as obvious what to do when the ancestries
>>> are longer. But what if we had a shortcut for *immediate* ancestors.
> 
> Could you expand on this?

This is the same idea as my 'last merged' concept.

> 
>>> Or maybe we introduce the concept of 'last merged' revisions. So even if
>>> you have this:
>>>
>>> http://aaronbentley.com/pastegraph.cgi?n=141
>>>
>>> You still have a 'last merged' revisions of B+J for the left side, and
>>> J+D for the right side.
> 
>>> So it wouldn't have to be a dominator if it is present in the
>>> 'last-merged' revisions of both sides.
> 
> Hmm.  Interesting.
> 
>>> Now, doing this:
>>> http://aaronbentley.com/pastegraph.cgi?n=143
>>>
>>> Would mean that we stop picking 'J' and revert back to the only
>>> dominator "A".
>>> But that seems reasonable considering the topology.
> 
> I don't understand why that would be reasonable.
> 
> Where I'm getting to is thinking that our current algorithm is better
> than LCA+DOM, except in its handling of criss-cross.  So maybe the best
> course is to:
> 
> 1. update it to skip criss-cross merge bases
> 2. work on caching strategies so that it doesn't take so long.  With
> proper caching, it should be a very cheap operation.
> 
> Aaron

As I understand it, the current algorithm uses the revision which is
farthest from 'null:' which is inside the ancestry of both sides. Is
that correct?

If it is, then a reasonable cache would be to just annotate each
revision with its maximum distance from 'null:'. This is fixed property
that only changes if ghosts are fleshed out. Which means that a cache is
perfect for it.

This has the behavior that it avoids shortcuts (like picking C from
http://aaronbentley.com/pastegraph.cgi?n=130).

What we haven't fully investigated, is that it preferentially picks
long-cuts.

Specifically, I noticed this behavior when working with bzr.dev. I can't
reproduce it now, but it was always preferentially selecting Robert's
integration branch, rather than a mainline bzr.dev revision, because
Robert had such a long integration history. I can't find a specific
instance right now, but something like the inverse problem of shortcuts.

We avoided LCA, because A->C->J cause A to look much closer than it
really should be. (A shortcut). The problem I was running into was that
it was picking A because the "C" path went through Robert's integration
branch, whereas I had just merged the other branch.

I wish I could come up with an example, I've tried a couple of times, an
in the end, it didn't work out.

I know it was an example where Robert had a large number of revisions in
his integration branch which were just coming in. And I was trying to
merge a branch I had just merged. I had just resolved all of the
conflicts from the integration branch, and the base it chose was inside
the integration branch again, and I had to resolve all of the conflicts
a second time.

If it happens again, I'll try to make sure to keep the revisions for
posterity.

John
=:->




-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 249 bytes
Desc: OpenPGP digital signature
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060220/b1f09efd/attachment.pgp 


More information about the bazaar mailing list