More merge base discussion [was: monotone's LCA+DOM algo for selecting a merge base]
John A Meinel
john at arbash-meinel.com
Mon Feb 20 18:10:33 GMT 2006
Aaron Bentley wrote:
> John A Meinel wrote:
>>> Isn't it true that, any time you get this pattern:
>>> http://aaronbentley.com/pastegraph.cgi?n=140
>>>
>>> Where we both merge a common revision, then LCA+DOM will pick the
>>> ultimate parent (A).
>>> When you probably want it to pick (J).
>
> Yes. And worse, a la http://aaronbentley.com/pastegraph.cgi?n=136
>
>>> I don't know if it is quite as obvious what to do when the ancestries
>>> are longer. But what if we had a shortcut for *immediate* ancestors.
>
> Could you expand on this?
This is the same idea as my 'last merged' concept.
>
>>> Or maybe we introduce the concept of 'last merged' revisions. So even if
>>> you have this:
>>>
>>> http://aaronbentley.com/pastegraph.cgi?n=141
>>>
>>> You still have a 'last merged' revisions of B+J for the left side, and
>>> J+D for the right side.
>
>>> So it wouldn't have to be a dominator if it is present in the
>>> 'last-merged' revisions of both sides.
>
> Hmm. Interesting.
>
>>> Now, doing this:
>>> http://aaronbentley.com/pastegraph.cgi?n=143
>>>
>>> Would mean that we stop picking 'J' and revert back to the only
>>> dominator "A".
>>> But that seems reasonable considering the topology.
>
> I don't understand why that would be reasonable.
>
> Where I'm getting to is thinking that our current algorithm is better
> than LCA+DOM, except in its handling of criss-cross. So maybe the best
> course is to:
>
> 1. update it to skip criss-cross merge bases
> 2. work on caching strategies so that it doesn't take so long. With
> proper caching, it should be a very cheap operation.
>
> Aaron
As I understand it, the current algorithm uses the revision which is
farthest from 'null:' which is inside the ancestry of both sides. Is
that correct?
If it is, then a reasonable cache would be to just annotate each
revision with its maximum distance from 'null:'. This is fixed property
that only changes if ghosts are fleshed out. Which means that a cache is
perfect for it.
This has the behavior that it avoids shortcuts (like picking C from
http://aaronbentley.com/pastegraph.cgi?n=130).
What we haven't fully investigated, is that it preferentially picks
long-cuts.
Specifically, I noticed this behavior when working with bzr.dev. I can't
reproduce it now, but it was always preferentially selecting Robert's
integration branch, rather than a mainline bzr.dev revision, because
Robert had such a long integration history. I can't find a specific
instance right now, but something like the inverse problem of shortcuts.
We avoided LCA, because A->C->J cause A to look much closer than it
really should be. (A shortcut). The problem I was running into was that
it was picking A because the "C" path went through Robert's integration
branch, whereas I had just merged the other branch.
I wish I could come up with an example, I've tried a couple of times, an
in the end, it didn't work out.
I know it was an example where Robert had a large number of revisions in
his integration branch which were just coming in. And I was trying to
merge a branch I had just merged. I had just resolved all of the
conflicts from the integration branch, and the base it chose was inside
the integration branch again, and I had to resolve all of the conflicts
a second time.
If it happens again, I'll try to make sure to keep the revisions for
posterity.
John
=:->
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 249 bytes
Desc: OpenPGP digital signature
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060220/b1f09efd/attachment.pgp
More information about the bazaar
mailing list