[MERGE] reconcile - aka fix - with tests

Robert Collins robertc at robertcollins.net
Thu Feb 23 21:01:47 GMT 2006


On Thu, 2006-02-23 at 09:29 -0600, John Arbash Meinel wrote:
> 
> This looks good. Though I thought we were looking to use the Testament
> sha1 in Revision entries.

I'm moving code around to make it reusable there, not changing semantics
- so the use of which sha1 is interesting-but-separate.

> In general, +1, though it would be nice if you checked to make sure a
> reweave was necessary before actually doing the reweave.

Yes, I can add that in.

> (And if you load the revisions ahead of time, you can do "topo_sort"
> to
> make sure they all get added ancestors first). 

I actually wrote a new topo sorter for this in a subsequent patch. Our
current topo_sort is rather inefficient:
we iterate over the whole graph pulling out childless nodes (which
removes them from the child list of their parents). 
If we have 2000 revisions in a line, thats 2000 list comprehensions to
reduce the graph. - so every node in the graph gets visited once for
each of its child layers - which is what, one check for the outermost
layer + one check of the (nodes - removed) + one check of the (nodes -
removed - removed)... in a pathological cases of 2000 nodes depending on
each other thats 2000 checks + 1999 checks + 1998 checks ... 1 check.
Obviously that grows as you get merges, but only proportional to the
depth of the graph?

Now, whats the sum of 1 + 2 + 3 .. + N 
Seem familiar ;) -> O((depth of graph)^2)

Now the topo sort I wrote for the updated reconcile patch attached
processes each node once and only once but does not produce the same
output as the current topo_graph: it will produce a topo sorted graph of
fairly arbitrary subgraphs - each node is depth first searched until an
emittable node (no parents not in the output) is found, and each merged
parent of a revision is checked once. It should perform in close to O(N)
even in the worst case.

topo_sort takes 4 seconds of wall time to sort the bzr graph.
the one in reconcile I haven't got an accurate time yet, I don't know
the python trick offhand to time a method - but its fast enough I can't
even count to '1 mississipi' before that phase completes.

Is this something thats able to replace the current topo_sort? What
attributes is topo_sort needed to have?

Its quite possible that using the current topo_sort is one of the
factors in making reweave slow.

Rob


-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fast-reconcile.patch
Type: text/x-patch
Size: 29147 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060224/88a27ab9/attachment.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 191 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060224/88a27ab9/attachment.pgp 


More information about the bazaar mailing list