[MERGE][1.0] New annotate merge that performs well on packs

Robert Collins robertc at robertcollins.net
Sun Dec 2 20:20:38 GMT 2007


Thanks for doing this. I have a few questions/suggestions, but am happy
if you've already thought of them all for it to go straight in. I agree
that this is important to be merged onto the 1.0 branch.

bb:tweak

On Sun, 2007-12-02 at 13:42 -0500, Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hi all,
> 
> Annotate merge (--merge-type=weave) is our best solution for handling
> criss-cross merges.  But the current implementation performs horribly on
> packs.  Fortunately, we don't actually need all the information used by
> the current implementation.  This new implementation uses only the
> precise information we need, so it performs far, far better.
> 
> With knits, it is slightly slower.
> 
> Old implementation (best of 3):
> real    0m1.877s
> user    0m1.688s
> sys     0m0.164s
> 
> New implementation (best of 3):
> real    0m2.029s
> user    0m1.856s
> sys     0m0.136s
> 
> So that's 1.10x as slow.

I think that's tolerable. If you know where the extra 10% is coming
from, noting that down somewhere might be good,

> But on packs:
> That's 5x as fast!

Nice!

Now my questions/suggestions. Firstly, the new module. While we
shouldn't discard clarity of code layout, extra modules have a
performance cost - module count is the major factor in our startup time
for operations when new modules are loaded.

PlanMergeVersionedFile looks like something I would expect to find in
either merge.py or versionedfile.py, and for PlanMerge definitely I
would initially look in merge.py. So one thing to consider - putting the
two new classes into merge.py and possibly versionedfile.py.

Should we deprecate VersionedFile.plan_merge?

Should we deprecate merge._plan_annotate_merge?

There is no module docstring on plan_merge.py. If you remove
plan_merge.py as I suggest above this is clearly a no-op. Otherwise,
consider what someone doing 'pydoc bzrlib.plan_merge' should see. I'd
consider something like:

"""Support for 2-way merges of text files using minimal annotations.

This module is useful when implementing merge logic for text files. E.g.
a GUI merge might want to use this to access structured data about how
two text files may merge.

The general way to use this module is to create a new plan_file =
_PlanMergeVersionedFile(file_id, [versionedfile, versionedfile, ..])
and provide that when making a new PlanMerge(version_id, version_id,
plan_file). Finally call plan_merge() on the PlanMerge object to obtain
the result.
"""

Lastly, you've used public class names for the new objects, if you
intend them to be an external interface this is great - and you should
mention them in the NEWS file under INTERNALS, as well as noting a
reference towards them in the developer guide.

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20071203/56dc0432/attachment.pgp 


More information about the bazaar mailing list