John Arbash Meinel
john at arbash-meinel.com
Wed Jul 2 22:40:29 BST 2008
-----BEGIN PGP SIGNED MESSAGE-----
I've done some more tweaking. And here are some interesting results:
Latest code is available at:
1) I added a value-level cache. So that when we read a leaf node, we
cache all the keys in that leaf node. This dramatically improved the
"iter_random_one" tests. Such that btree becomes *faster* than graph.
Some specific values:
~ 6.006 graph
~ 102.450 btree no key cache
~ 108.513 btree 10,000 key cache
~ 4.786 btree 100,000 key cache
~ 51.803 btree no key cache, 200 leaf node cache
~ 7.960 btree no key cache, 500 leaf node cache
~ 7.613 btree no key cache, 1,000 leaf node cache
~ 7.814 btree no key cache, 10,000 leaf node cache
With no cache, the 100 node leaf cache is about 50% efficient (it hits
50k times, and misses 54k times)
With a key cache of 10,000 entries, it hits 50% of the time (50.4k hits
vs 55k misses), but causes the leaf node cache to drop to 16k hits
versus 39k misses.
With a key cache of 100,000 entries, we hit 99% of the time (104k hits
versus 1k misses), but the leaf node cache only misses on all 1067 checks.
With a node cache of 500, 1,000, or 10,000 we hit 99% of the time, but
are half the speed of a key cache.
With a node cache of 200, we hit 77k versus 27k misses. It still costs
us a lot.
2) The LRUCache has a bit of overhead. Enough that doing (1) slowed down
things like search_from_revid.
3) The biggest penalty in graph.iter_ancestry() is that we search for
revision ids in indexes where it is not present. Specifically, we have
over 60,000 lookups that miss. Bloom only catches about 6,000 (10%) of that.
Getting rid of the leaf cache in favor of the line cache hurts a lot
here. Because we get to the leaf node, and have to read it again to
realize that the key we want isn't there.
I added the ability to cache that a revision_id isn't present (you add
_not_present_object into the cache). However, for iter_ancestry() we
don't actually revisit the same revisions multiple times. So it doesn't
improve things a lot. It did hit somewhat for get_componets_positions().
4) I added more instrumentation, to measure hit rate, miss rate, cached
node hit rate, etc. It is actually pretty interesting to have, as it
gives some good info as to what is really going on.
5) My _multi_bisect_right does help. With no line cache, and only 100
leaf node (and internal node) caches:
get_components_positions() 28.480s => 14.574s
search_from_rev_id() 2.517s => 1.200s
miss_torture() 25.094s => 10.960s
It is also important to test with the exact same repository. As I
commit, the number of packs change, and the performance is impacted.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
-----END PGP SIGNATURE-----
More information about the bazaar