B+tree discussions

Wed Jul 2 22:40:29 BST 2008

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I've done some more tweaking. And here are some interesting results:

Latest code is available at:
~  lp:~jameinel/+junk/bzr-index2

1) I added a value-level cache. So that when we read a leaf node, we
cache all the keys in that leaf node. This dramatically improved the
"iter_random_one" tests. Such that btree becomes *faster* than graph.

Some specific values:
~    6.006	graph
~  102.450	btree no key cache
~  108.513	btree 10,000 key cache
~    4.786	btree 100,000 key cache
~   51.803	btree no key cache, 200 leaf node cache
~    7.960	btree no key cache, 500 leaf node cache
~    7.613	btree no key cache, 1,000 leaf node cache
~    7.814	btree no key cache, 10,000 leaf node cache

With no cache, the 100 node leaf cache is about 50% efficient (it hits
50k times, and misses 54k times)
With a key cache of 10,000 entries, it hits 50% of the time (50.4k hits
vs 55k misses), but causes the leaf node cache to drop to 16k hits
versus 39k misses.
With a key cache of 100,000 entries, we hit 99% of the time (104k hits
versus 1k misses), but the leaf node cache only misses on all 1067 checks.
With a node cache of 500, 1,000, or 10,000 we hit 99% of the time, but
are half the speed of a key cache.
With a node cache of 200, we hit 77k versus 27k misses. It still costs
us a lot.

2) The LRUCache has a bit of overhead. Enough that doing (1) slowed down
things like search_from_revid.

3) The biggest penalty in graph.iter_ancestry() is that we search for
revision ids in indexes where it is not present. Specifically, we have
over 60,000 lookups that miss. Bloom only catches about 6,000 (10%) of that.
Getting rid of the leaf cache in favor of the line cache hurts a lot
here. Because we get to the leaf node, and have to read it again to
realize that the key we want isn't there.
I added the ability to cache that a revision_id isn't present (you add
_not_present_object into the cache). However, for iter_ancestry() we
don't actually revisit the same revisions multiple times. So it doesn't
improve things a lot. It did hit somewhat for get_componets_positions().

4) I added more instrumentation, to measure hit rate, miss rate, cached
node hit rate, etc. It is actually pretty interesting to have, as it
gives some good info as to what is really going on.

5) My _multi_bisect_right does help. With no line cache, and only 100
leaf node (and internal node) caches:

get_components_positions() 28.480s => 14.574s
search_from_rev_id() 2.517s => 1.200s
miss_torture() 25.094s => 10.960s

It is also important to test with the exact same repository. As I
commit, the number of packs change, and the performance is impacted.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkhr9c0ACgkQJdeBCYSNAANbEACfWRUKe6Y8RWa7VxWz00bhlQ5p
/AwAni1cV4MEiw/0ZCW9BfPNhcy7WvB/
=TfiH
-----END PGP SIGNATURE-----