[MERGE] Implement Tree.plan_merge, deprecate RevisionTree.get_weave

Robert Collins robertc at robertcollins.net
Thu Jul 19 14:56:14 BST 2007


On Thu, 2007-07-19 at 08:40 -0400, Aaron Bentley wrote:
> Robert Collins wrote:
> > Robert Collins has voted +1 (conditional).
> > Status is now: Conditionally approved
> > Comment:
> > 
> > +            # Disable pending merges, to avoid affecting merge behavior
> > +            tree.set_parent_ids(parents[:1])
> > 
> > Could you explain more why this is needed?
> 
> plan_merge is affected by the list of ancestors.  If OTHER is listed as
> an ancestor of THIS, that means that all of the changes in OTHER have
> already been considered and rejected, so the merge doesn't need to do
> anything.
> 
> > It seems prone to losing a
> > users data e.g. during a power-off or kill -9 situation as well as
> > requiring more work at the tree layer to discard and recreate the cached
> > parent data.
> 
> Yeah, the question is whether we care enough in these situations:
> 
> 1. remerge is rarely used
> 2. user files will also be a mess in this situation, so it's only
>    incrementaly worse
> 3. if we do lose the parent data, users can just do "revert" and "merge"
>    instead.

I think it would be very surpising to a user, as they won't have any
reason to think this could have happened - they may happily finish their
merge and commit, but without the parent data - introducing conflicts.

On consideration I'd really rather see this controlled through a flag to
the function or something.

> > with our current API
> > 
> > +    def _get_ancestors(self, default_revision):
> > +        ancestors = set([default_revision])
> > +        for parent_id in self.get_parent_ids():
> > +            ancestors.update(self.branch.repository.get_ancestry(
> > +                             parent_id, topo_sorted=False))
> > +        return ancestors
> > 
> > will be faster as
> > +        ancestors = set([default_revision])
> > +
> > ancestors.update(self.branch.repository.get_revision_graph(self.get_parent_ids()))
> > 
> > +        return ancestors
> > 
> > because that will do just one index traversal.
> 
> True, but it's exceptional to perform a merge while there is more than
> one ancestor, and it seems clearer to use get_ancestry.

well both ancestors and get_revision_graph do full graph traversal, its
just that ancestors flattens the graph - personally I don't see using
get_ancestors in a loop as being clearer. But its definately much
slower.

On this point I'll leave it up to you - but ask that if you leave it as
get_ancestry with a loop that you have a XXX with the faster version and
a note (e.g. # XXX Using get_ancestry for clarity; if this becomes a
performance sticking point consider
self.branch.repository.get_revision_graph(self.get_parent_ids()))

> > Its interesting that we grab full ancestry here. This doesn't affect the
> > merge, but I wonder about giving a callback here, or even using the file
> > merge graph which will be much smaller to reduce the amount of index
> > looked at.
> 
> Not really necessary.  We really just need graph difference.

Do you mean 'we only need the revisions from the tips to the full set of
lcas ?' 

Cheers,
Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20070719/a20cadd4/attachment.pgp 


More information about the bazaar mailing list