log._graph_view_revisions optimization

Parth Malwankar parth.malwankar at gmail.com
Wed Mar 24 09:36:21 GMT 2010


On Tue, Mar 23, 2010 at 1:30 PM, Vincent Ladeuil <v.ladeuil+lp at free.fr> wrote:
>>>>>> "Parth" == Parth Malwankar <parth.malwankar at gmail.com> writes:
>
>    Parth> Hello,
>    Parth> In order to do some optimization of bzr-grep I noticed using
>    Parth> --lsprof that about 45% of the time is spend getting the
>    Parth> revision range tuple (revid, revno, merge_depth) for which
>    Parth> bzr-grep uses bzrlib.log._graph_view_revisions
>
>    Parth> Forcing it to a list (as its lazy):
>    Parth>     given_revs = logcmd._graph_view_revisions(...)
>    Parth>     given_revs = list(given_revs)
>
>    Parth> Using the emacs repo as test data I found that around
>    Parth> 6.5s are spent on this on my machine (last:5..last:1). The total
>    Parth> grep time is ~15s with rev range.
>
> This is most probably caused by calculating revnos (which in most
> cases requires to load the whole graph).
>
> As a side note, you'd better use
> repo.iter_reverse_revision_history or
> branch.iter_merge_sorted_revisions than _graph_view_revisions,
> the former are public while the later is private.
>
>    Parth> I don't know much about _graph_view_revisions but,
>    Parth> interestingly, this time taken _seems_ to depend on
>    Parth> the total history size rather than the range
>    Parth> window. Doing (last:5..last:1) for a repo of ~100revs
>    Parth> it takes only about 0.3s for hot cache.
>
> See above, if you use iter_reverse_revision_history you will be
> able to decide when to generate the revnos.
>

Thanks for the pointers Vincent. Using iter_reverse_revision_history
is indeed much faster. Now grepping the mainline for emacs repo
all files is down to ~9s from ~15s before this change. Right now I
use this once I am sure user wan't to grep only mainline and not
dotted revs.

>
> It all boils done to how best calculate the revnos, doing so on
> the mainline is trivial but as soon as you need to calculate a
> dotted revno you almost always end up needing to load the whole
> graph.
>
> There are plans to make this better but in the meantine, I urge
> you to use public methods or failing that, ask for the private
> ones to be made public.
>

For dotted revs I will also update grep to use
branch.iter_merge_sorted_revisions instead of log._graph_view_revisions
but I suppose the performance will not go up much there but that
should be ok.

I think now there are at least two cases (log and grep) where we
need revision number list. I am not sure if there are more commands
like this (merge?). I think it may be useful to have something like
log._linear_view_revisions and log._graph_view_revisions as bzrlib
public functions.

Regards,
Parth



More information about the bazaar mailing list