log._graph_view_revisions optimization

Tue Mar 23 08:00:18 GMT 2010

>>>>> "Parth" == Parth Malwankar <parth.malwankar at gmail.com> writes:

    Parth> Hello,
    Parth> In order to do some optimization of bzr-grep I noticed using
    Parth> --lsprof that about 45% of the time is spend getting the
    Parth> revision range tuple (revid, revno, merge_depth) for which
    Parth> bzr-grep uses bzrlib.log._graph_view_revisions

    Parth> Forcing it to a list (as its lazy):
    Parth>     given_revs = logcmd._graph_view_revisions(...)
    Parth>     given_revs = list(given_revs)

    Parth> Using the emacs repo as test data I found that around
    Parth> 6.5s are spent on this on my machine (last:5..last:1). The total
    Parth> grep time is ~15s with rev range.

This is most probably caused by calculating revnos (which in most
cases requires to load the whole graph).

As a side note, you'd better use
repo.iter_reverse_revision_history or
branch.iter_merge_sorted_revisions than _graph_view_revisions,
the former are public while the later is private.

    Parth> I don't know much about _graph_view_revisions but,
    Parth> interestingly, this time taken _seems_ to depend on
    Parth> the total history size rather than the range
    Parth> window. Doing (last:5..last:1) for a repo of ~100revs
    Parth> it takes only about 0.3s for hot cache.

See above, if you use iter_reverse_revision_history you will be
able to decide when to generate the revnos.

    Parth> I also noticed, _graph_view_revisions, _always_
    Parth> returns the merged revisions, e.g. 1.1.1, which I
    Parth> either user or reject depending on the --levels
    Parth> setting by the user. Can this be controlled (based on
    Parth> --levels) in case it gives a performance gain?

Yes, see remark above.

    Parth> For a single revision case (-r spec) I have added the
    Parth> optimization to bypass the _graph_view_revisions
    Parth> call. I am wondering if there is something that can be
    Parth> done to get a revision range faster.

It all boils done to how best calculate the revnos, doing so on
the mainline is trivial but as soon as you need to calculate a
dotted revno you almost always end up needing to load the whole
graph.

There are plans to make this better but in the meantine, I urge
you to use public methods or failing that, ask for the private
ones to be made public.

     Vincent