[PATCH] more work on merge

John A Meinel john at arbash-meinel.com
Sun Jul 10 18:13:18 BST 2005


Aaron Bentley wrote:
> John A Meinel wrote:
>
>>>Aaron Bentley wrote:
>>>
>>>
>>>>I've been working on computing the most recent common ancestor out of
>>>>all ancestors, but it's taking a while to get right.
>>>
>>>
>>>Did you see my work in the merge plugin? Is there a reason my method was
>>>not correct?
>
>
> Mostly, I forgot about it.  Tunnel-vision, you know.  Martin told me
> he'd implemented add_pending_merge(), and seemed to be holding back on
> applying it to the code because I was working on merge.  So I figured I
> should apply it, and I started writing the things I'd need.
>
>
>>>Perhaps it was just too slow.
>
>
> Slow but right is okay by me.  I was going to implement something
> potentially slow and then let someone write a faster implementation if &
> when it mattered.
>
>
>>>My method was to get the list
>>>of all revisions in the TO branch, and then search breadth-first into
>>>the FROM branch until I found a revision which existed in both.
>
>
> This sounds like a workable approach.  I don't think it detects the
> crossed-branches case, at least not out of the box.
>
> If A merges B, and B merges A, and then they both commit, then the next
> time you merge, you get two equally good candidate bases.  See this post
> for the same problem in an Arch context:
> http://lists.gnu.org/archive/html/gnu-arch-users/2004-09/msg00279.html
>
> I think the best thing to do in that case is to select an earlier base.
>  But this is the sort of situation that Codeville Merge handles gracefully.
>
> What I've been working on would be somewhat slower than your approach,
> but a bit more rigorous.  First getting the full ancestry of both A and
> B, finding their set intersection, and then picking the revision in the
> intersection that is closest to A and closest to B, if there is such a
> thing.

How about this, if you look at my current code, it traces back through
history, and keeps a 'set'.

Instead of using a set, use a dictionary, mapping rev_id => distance to
target.

Basically the same recursive (stack-based) search that exists in
"get_revision_set" in my changeset or merge code, but when it pushes
something onto the stack, it also pushes the current distance.

Here is a rough adaptation of my merge code:
def get_revision_set(branch, target_rev_id=None):
    """Return a dict mapping each revision id in the ancestry of
    this branch to its minimum distance from the target revision.
    """
    this_revs = {}
    if target_rev_id is None:
        to_search = [(branch.last_patch(), 0)]
    else:
        to_search = [(target_rev_id, 0)]

    while to_search:
        rev_id, distance = to_search.pop(0)
        # Skip if we already reached this revision by a shorter route
        if rev_id in this_revs and this_revs[rev_id] <= distance:
            continue
        this_revs[rev_id] = distance
        rev = branch.get_revision(rev_id)
        distance += 1
        for parent in rev.parents:
            p_id = parent.revision_id
            if p_id not in this_revs or this_revs[p_id] > distance:
                to_search.append((p_id, distance))
    return this_revs

With this code, each revision gets a distance to the target revision,
defined as the minimum number of revisions in between. This would
probably work for what you are thinking.
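For instance, given two such distance maps (one traced from the head of
each branch), the base selection you describe might look something like
this. To be clear, best_base is a hypothetical helper I'm sketching
here, not something in the merge code:

def best_base(dist_a, dist_b):
    """Pick the common ancestor minimizing combined distance to both
    heads.  dist_a and dist_b are rev_id -> distance maps, as returned
    by get_revision_set() for each branch.  Returns None when the
    ancestries do not intersect."""
    common = set(dist_a) & set(dist_b)
    if not common:
        return None
    return min(common, key=lambda r: dist_a[r] + dist_b[r])

Ties (the crossed-branches case) would still need a tiebreak rule, since
min() just takes whichever candidate it sees first.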

I can't say it handles crossed history best, and it still suffers from
a simple patch applied late creating a shortcut through history.
I suppose a different possibility is to track all distances to a
revision, rather than just the shortest one, but that does imply having
to trace all routes through history.

Otherwise you could do some sort of "average" distance, rather than
minimum distance.
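To sketch what tracking all distances might look like, here is a toy
version using a plain parent map instead of real branch objects (a
hypothetical stand-in for branch.get_revision(...).parents). Note it
really does trace every route, so the cost blows up on merge-heavy
history:

from collections import defaultdict

def average_distances(parents, start):
    """Collect every path length from 'start' back to each ancestor,
    then average them.  'parents' maps rev_id -> list of parent ids.
    Enumerates all routes through history, so this is exponential in
    the worst case."""
    lengths = defaultdict(list)
    stack = [(start, 0)]
    while stack:
        rev_id, d = stack.pop()
        lengths[rev_id].append(d)
        for p in parents.get(rev_id, []):
            stack.append((p, d + 1))
    return {r: sum(ds) / len(ds) for r, ds in lengths.items()}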

>
> Aaron

John
=:->


More information about the bazaar mailing list