CHKMap usage - optimizations?

Jelmer Vernooij jelmer at samba.org
Thu May 14 12:22:25 BST 2009


Hi John,

On Wed, May 13, 2009 at 09:49:44AM -0500, John Arbash Meinel wrote:
> >>> Most of the time is spent in CHKMap at the moment:

> >>> CHKMap.apply_delta(): 77.63%
> >>>  * of which: Knit.add_lines(): 47.27%
> >>> CHKMap.iteritems(): 5.14%

> >>> What's the best way of improving this performance? Switching to packs?

> >> They will be faster.

> > What's the best place to be looking to get started with this? Can you
> > give me a few hints? bzrlib.plugins.search.index is a bit overwhelming.

> So I'm wondering if this isn't the problem that transactions no longer
> cache the knit object. So we might end up re-reading the entire .kndx
> for every insert. I'm not positive about this, but if it is true, there
> would be a huge boost just switching to packs. (They don't require
> reading the whole index to access one entry.)

> What are you using for the parent info for these entries? I'm just
> thinking that CHK doesn't hint to knits/packs how to delta compress, as
> we went with 'groupcompress' and hint a different way.

> So you might also look at how 'minimal' your deltas are. I know when I
> was tuning the converter, having a small delta was quite beneficial.
> (Often you can use different bases to generate a delta that results in
> the same final content. I don't know if that is relevant for you.)

This is the history of Evolution, which lives in Subversion and is very
linear; I always use the lhs (left-hand side) parent, so I'm confident
I'm using the right parents, at least for Evolution.

> >>> Is there some way I can use transactions here?

> >> apply_delta is pretty efficient. You are perhaps suffering some
> >> thrashing on page re-reads, but the basic problem is likely to be kndx
> >> related.
> > The 30% that CHKMap.apply_delta() doesn't spend in Knit.add_lines()
> > seems to be time spent purely in apply_delta itself. 
> Sure, but if we make that take 0 time, it still is only 30% of your time
> spent...
It'll probably matter significantly more once the storage mechanism is
optimized, so I figured it was worth asking about.
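
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. It uses only the two percentages from the profile summary quoted
above and assumes both are shares of total runtime. The storage_speedup
factor is purely hypothetical; nothing here is a measurement of packs.

```python
# Shares of total runtime, taken from the profile summary quoted above.
APPLY_DELTA_TOTAL = 77.63  # CHKMap.apply_delta(), including callees
ADD_LINES = 47.27          # Knit.add_lines(), called from apply_delta()


def apply_delta_own_share(storage_speedup):
    """Percentage of runtime spent in apply_delta() itself (excluding
    add_lines()) if add_lines() became storage_speedup times faster.

    storage_speedup is a hypothetical factor, not a measured one.
    """
    own = APPLY_DELTA_TOTAL - ADD_LINES
    # Everything except add_lines() keeps its cost; add_lines() shrinks.
    new_total = (100.0 - ADD_LINES) + ADD_LINES / storage_speedup
    return 100.0 * own / new_total


for speedup in (1, 2, 10):
    share = apply_delta_own_share(speedup)
    print("add_lines %2dx faster -> apply_delta itself: %.1f%%" % (speedup, share))
```

With no speedup this reproduces the roughly 30% mentioned above; the
faster the storage layer gets, the larger the share apply_delta itself
takes of what remains.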

> That said, there are a lot of places that could be the issue. Is there a
> chance to get a profile? (I prefer --lsprof-file foo.txt since I don't
> always have KCacheGrind available.)
Attached.
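
For anyone who wants the same kind of plain-text report from a Python
snippet rather than from a whole bzr command, a minimal sketch using the
standard library's cProfile and pstats follows. This is an analogue only,
not bzr's own lsprof code, and work() is a hypothetical stand-in for the
code being profiled.

```python
import cProfile
import io
import pstats


def work():
    # Hypothetical stand-in for the code you actually want to profile.
    return sum(i * i for i in range(10000))


profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

# Render the ten most expensive entries, by cumulative time, as text.
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)
stats.sort_stats("cumulative").print_stats(10)
report = buf.getvalue()
print(report)
```

The resulting text can be written to a file and read without KCacheGrind.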

Cheers,

Jelmer

-------------- next part --------------
A non-text attachment was scrubbed...
Name: evolution.txt.bz2
Type: application/octet-stream
Size: 29965 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20090514/3c71cb1a/attachment-0001.obj 