Incremental merge_sorted?

John Arbash Meinel john at
Sat Jun 17 21:39:48 BST 2006


Aaron Bentley wrote:
> One of the new bottlenecks show_logs is merge_sorted.  It scales with
> the size of the revision graph.  So on my 1000-commit tree, lsprof says
> the _push_node and _pop_node operations take .0409 and .0286 s
> respectively, out of .2515 s spent in _show_log
> This is particularly silly for 'bzr log -r -5..', but also bad for the
> log|less case.
> Solving the problem for the '-r -5..' case could be done by trimming the
> graph.  But I don't see how to solve it in the general case, except by
> making the process incremental.
> In fact, my lsprof says that of .2515 seconds, .0968 are spent in
> get_revision_graph, and .1173 are spent in merge_sorted, so this
> topological sorting is dominating the log time.  Particularly nasty
> since two of our three log formatters suppress merges, anyhow.
> Aaron

First thing you could do is change the code so that outside the loop you do:

show_merge = getattr(lf, 'show_merge', None)

Then you can do:

if show_merge is None:

  merge_sorted_revisions = [(count, rev_id, 0, True)
    for count, rev_id in enumerate(mainline_revs)]
else:
  merge_sorted_revisions = merge_sort(...)

Then we don't have to call merge_sort at all for the shorter log formats.
And we avoid the getattr() call in the middle of the loop.
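To make the idea concrete, here is a hedged sketch of that fast path. The formatter class, merge_sort_stub, and sorted_revisions are illustrative stand-ins, not bzrlib's real API; the point is just that when the formatter lacks show_merge, the merge-sorted list collapses to the mainline and we can build the (sequence, rev_id, depth, end_of_merge) tuples directly:

```python
def merge_sort_stub(mainline_revs):
    """Stand-in for the real merge_sort; not bzrlib's implementation."""
    return [(count, rev_id, 0, True)
            for count, rev_id in enumerate(mainline_revs)]


def sorted_revisions(lf, mainline_revs):
    # Hoist the attribute lookup out of the per-revision loop.
    show_merge = getattr(lf, 'show_merge', None)
    if show_merge is None:
        # Fast path: mainline only, depth 0, every rev ends a "merge".
        return [(count, rev_id, 0, True)
                for count, rev_id in enumerate(mainline_revs)]
    return merge_sort_stub(mainline_revs)


class ShortFormatter:
    """Hypothetical formatter with no show_merge method."""


revs = ['rev-1', 'rev-2', 'rev-3']
print(sorted_revisions(ShortFormatter(), revs))
# [(0, 'rev-1', 0, True), (1, 'rev-2', 0, True), (2, 'rev-3', 0, True)]
```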

The other possibility is that if you look at MergeSorter, merge_sort()
is just a wrapper around the MergeSorter.sorted() function call, which
is just a wrapper around iter_topo_order().

That sounds to me like it should be possible to iterate only a few of them.
The only problem is that it has already touched all the revisions once
before it starts yielding them.
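To illustrate both halves of that, here is a minimal generator-based topological sort (Kahn's algorithm, not bzrlib's actual iter_topo_order). A caller like 'bzr log -r -5..' could islice off just a few revisions instead of materialising the whole list, but note the up-front pass: every node is touched once to count its parents before the first yield, which is exactly the cost described above:

```python
from itertools import islice


def iter_topo_order(graph):
    """Sketch of an incremental topo sort; graph maps node -> [parents]."""
    # Up-front pass: touches every revision once before yielding anything.
    children = {node: [] for node in graph}
    pending = {}
    for node, parents in graph.items():
        pending[node] = len(parents)
        for parent in parents:
            children[parent].append(node)
    ready = [n for n, count in pending.items() if count == 0]
    while ready:
        node = ready.pop()
        yield node
        for child in children[node]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)


graph = {'r1': [], 'r2': ['r1'], 'r3': ['r2'], 'r4': ['r3']}
# An incremental caller can stop after a few revisions:
first_two = list(islice(iter_topo_order(graph), 2))
print(first_two)  # ['r1', 'r2']
```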

I think the code makes some assumptions that you've seen all the nodes,
so that you don't have to do any more searching. I'm not sure how much it
can be tweaked.



More information about the bazaar mailing list