'bzr branch -r -XXX foo bar' slow for long histories

Robert Collins robertc at robertcollins.net
Tue Mar 18 22:30:47 GMT 2008

On Tue, 2008-03-18 at 15:36 -0500, John Arbash Meinel wrote:
> So to get 87.5k parents from disk costs us 15s, returning 87.5k from a
> memory dict costs 2.5s, or about 6x faster.

GraphIndex caches in memory. We're not getting 87.5k parents from disk
if we're using the same graph index objects, and we should be using
them; if we're not, it's a bug.

> 3) Each call into _lookup_keys_via_location triggers 1 call into
> _resolve_references and 2 calls into _read_and_parse.

If the keys are parsed, or known not to be in the index, this should be
essentially a no-op.
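
As a rough illustration of that expectation (this is not bzrlib code;
ToyIndex, get_parents and the disk_reads counter are hypothetical names
invented for the sketch), an index that remembers both parsed keys and
keys known to be absent only touches disk once per distinct key, as long
as the same index object is reused:

```python
class ToyIndex:
    """Hypothetical stand-in for an index; not bzrlib's GraphIndex."""

    def __init__(self, on_disk):
        self._on_disk = on_disk   # dict simulating the index file
        self._parsed = {}         # key -> parents already parsed
        self._missing = set()     # keys known not to be in the index
        self.disk_reads = 0       # counts simulated disk accesses

    def get_parents(self, key):
        # Already parsed: answer from memory.
        if key in self._parsed:
            return self._parsed[key]
        # Known to be absent: also answered from memory.
        if key in self._missing:
            return None
        # Otherwise pay for one (simulated) disk read and remember the result.
        self.disk_reads += 1
        parents = self._on_disk.get(key)
        if parents is None:
            self._missing.add(key)
        else:
            self._parsed[key] = parents
        return parents


index = ToyIndex({"rev-2": ("rev-1",), "rev-1": ()})
for _ in range(3):
    index.get_parents("rev-2")
    index.get_parents("no-such-rev")
```

Creating a fresh ToyIndex for every lookup would discard both caches,
which is the kind of bug that would make repeated lookups hit disk.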

> Both of those calls happen even if we have the nodes in memory (I
> think). It may be trivial and realize that there are no ranges.
> We don't spend a lot of time in open() seek() and read(). Though I have
> an interesting workaround that is able to save about 10% of the time by
> caching the files in an LRUCache.

This suggests we are doing too much disk IO. The all_packs() call starts
to sound suspicious; I'll have a peek at that today.
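
For reference, the general shape of an LRU cache over files might look
like the sketch below. It is only a guess at the idea behind the
workaround, not John's patch and not bzrlib's LRUCache: FileLRUCache,
the opener callable and the fake file names are all assumptions made
for the example, and it caches file contents rather than open handles.

```python
from collections import OrderedDict


class FileLRUCache:
    """Hypothetical least-recently-used cache of file contents."""

    def __init__(self, opener, max_items=2):
        self._opener = opener        # callable: path -> file contents
        self._max_items = max_items  # how many files to keep in memory
        self._cache = OrderedDict()  # path -> contents, oldest first
        self.opens = 0               # counts calls that reached the opener

    def get(self, path):
        if path in self._cache:
            # Cache hit: mark this path as most recently used.
            self._cache.move_to_end(path)
            return self._cache[path]
        # Cache miss: open the file and remember its contents.
        self.opens += 1
        data = self._opener(path)
        self._cache[path] = data
        if len(self._cache) > self._max_items:
            # Evict the least recently used entry.
            self._cache.popitem(last=False)
        return data


# A dict stands in for the filesystem so the sketch runs anywhere.
fake_files = {"a.pack": b"A", "b.pack": b"B", "c.pack": b"C"}
cache = FileLRUCache(fake_files.__getitem__, max_items=2)
for name in ["a.pack", "b.pack", "a.pack", "c.pack", "a.pack", "b.pack"]:
    cache.get(name)
```

A cache like this only hides repeated opens; it does not explain why the
same files are being requested so often in the first place.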


GPG key available at: <http://www.robertcollins.net/keys.txt>.