trac-bzr performance

Fri Nov 27 21:49:58 GMT 2009

Matt Nordhoff wrote:
> Martin von Gagern wrote:
>> Aaron Bentley wrote:
>>>> I fear so. On the other hand, there are quite a lot of parts in trac-bzr
>>>> which call Repository.get_ancestry
>>> This kind of thing is why I abandoned trac-bzr.  I think Trac has a
>>> serious impedance mismatch with bzr.  Any operation that scales with
>>> whole history is bad.  That's why we have iter_reverse_revision_history,
>>> and why _lefthand_history is considered "evil".
>>>
>>>> I assume the cost for the generation of dotted revision
>>>> numbers is on a similar scale. Therefore it should be possible to
>>>> achieve simple calculations on the complete parent map in a comparable
>>>> timeframe. 
>>> Yes, but it does not scale.
>> I agree that operations scaling by whole history should be avoided if at
>> all possible. I'm not sure this is inherently impossible with Trac. I'm
>> planning to get rid of Repository.get_ancestry in as many places as
>> possible, maybe all.
>>
>> On the other hand, I wonder how other apps cope with these issues.
>> Loggerhead for example uses dotted revnos as well, and also prints
>> information about when a rhs changeset was merged into mainline. Haven't
>> looked at the sources yet but I can't help but wonder how these things
>> are possible without potentially inspecting the whole parent map.
>>
>> Or is it that in theory the whole history would need to be inspected,
>> but that in practice the modifications users are likely interested in
>> are somewhere near the top of the tree and therefore can be answered
>> cheaply, while only ancient revisions would require a deeper and more
>> expensive look?
>>
>> Any pointers to relevant bzr API functions would be appreciated!
>>
>> Greetings,
>>  Martin
> 
> Loggerhead generates and caches a copy of the entire history [1], so it
> probably won't help you much.

I should've said "entire revision graph".
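
To make the distinction concrete: by "revision graph" I mean the parent
map (revision id -> parent ids), not just the mainline. Here is a minimal
sketch, in plain Python with a made-up parent map rather than real bzrlib
calls, of walking left-hand parents lazily so that questions about
revisions near the tip stay cheap. The names PARENT_MAP, iter_lefthand and
distance_from_tip are hypothetical and exist only for illustration.

```python
# Hypothetical parent map: revision id -> tuple of parent ids, with the
# left-hand (mainline) parent first. This is illustration data, not bzr's.
PARENT_MAP = {
    "rev-4": ("rev-3", "feature-2"),
    "rev-3": ("rev-2",),
    "feature-2": ("feature-1",),
    "feature-1": ("rev-2",),
    "rev-2": ("rev-1",),
    "rev-1": (),
}


def iter_lefthand(parent_map, tip):
    """Yield revisions from tip back along left-hand parents, lazily."""
    current = tip
    while current is not None:
        yield current
        parents = parent_map.get(current, ())
        current = parents[0] if parents else None


def distance_from_tip(parent_map, tip, wanted):
    """Return the number of mainline steps from tip to wanted, or None.

    The walk stops as soon as wanted is found, so the cost follows the
    distance from the tip. It only touches the whole mainline when wanted
    is very old or is not on the mainline at all.
    """
    for steps, revid in enumerate(iter_lefthand(parent_map, tip)):
        if revid == wanted:
            return steps
    return None
```

The worst case is still a walk over the full mainline, which is why this
only helps when most queries are about recent revisions.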

> This probably isn't such a bad idea in principle (if you have tons and
> tons of RAM), but the graph takes up to 10-15 seconds to generate on
> really large branches (Launchpad, MySQL), and Loggerhead also serializes
> the graphs to disk [2], only holding a few in RAM at one time to save,
> uh, RAM. Plus generating a bunch of them at one time can slow it to a
> crawl. [3]
> 
> [1]
> <http://bazaar.launchpad.net/~loggerhead-team/loggerhead/trunk-rich/annotate/head%3A/loggerhead/wholehistory.py#L41>
> (short URL: <http://xrl.us/bgd647>)
> 
> [2]
> <http://bazaar.launchpad.net/~loggerhead-team/loggerhead/trunk-rich/annotate/head%3A/loggerhead/changecache.py#L120>
> (short URL: <http://xrl.us/bgd66c>)
> 
> [3] <https://bugs.launchpad.net/bugs/118625>
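
The general pattern (build once, serialize to disk, keep only a few in
RAM) can be sketched roughly as below. This is not Loggerhead's actual
code; GraphCache, build_graph and max_in_ram are hypothetical names, and
the sketch assumes each key is safe to use as a file name.

```python
import os
import pickle
from collections import OrderedDict


class GraphCache(object):
    """Keep at most max_in_ram graphs in memory, with every graph on disk."""

    def __init__(self, cache_dir, build_graph, max_in_ram=2):
        self._dir = cache_dir
        self._build = build_graph   # hypothetical callable: key -> graph
        self._max = max_in_ram
        self._ram = OrderedDict()   # key -> graph, least recently used first

    def _path(self, key):
        # Assumes key is a safe file name (no path separators).
        return os.path.join(self._dir, key + ".pickle")

    def get(self, key):
        if key in self._ram:
            # Pop so the re-insert below marks the entry as most recent.
            graph = self._ram.pop(key)
        elif os.path.exists(self._path(key)):
            # Cheaper than rebuilding: load the serialized copy.
            with open(self._path(key), "rb") as f:
                graph = pickle.load(f)
        else:
            # The slow step, done once per key and then written to disk.
            graph = self._build(key)
            with open(self._path(key), "wb") as f:
                pickle.dump(graph, f)
        self._ram[key] = graph
        while len(self._ram) > self._max:
            self._ram.popitem(last=False)   # evict least recently used
        return graph
```

Note that nothing here stops several slow builds from running at the same
time, which is the situation described in [3].
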
-- 
Matt Nordhoff