B+Tree indices: ongoing progress

Robert Collins robertc at robertcollins.net
Wed Jul 2 14:07:05 BST 2008


On Wed, 2008-07-02 at 17:44 +1000, Robert Collins wrote:
> 
> Which I find useful; but its not conclusive. More news tomorrow I
> think,
> modulo an update with a better test script and so forth before I
> retire.

I've pushed a new plugin version. I haven't merged John's latest patch
yet - sorry. What I have done is:
 - made an indexbench command
 - modularised the fixtures somewhat
 - added lsprof integration of just the fixture (just missing done so
far).
 - added disk cache dropping facilities
 - made all of these optional.
 - added in bloom filters to the creation of internal nodes
 - created optional use of the bloom filters when reading to avoid
   leaf node IO.

So far I've learnt that we can save 6 seconds on the missing torture
test (where very large key sets are used) by not sorting the keys. So I
did that.

I haven't yet added a rich root variation of the format, or made the use
of the bloom filters optional-from-the-command-line, which would be good
for doing comparison tests.

We probably want to change all the fixtures to use a CombinedGraphIndex
too, to better test the impact of having multiple indices present - some
of the tests, e.g. iter_random_one are really not effectively testing
the access pattern. Alternatively, having iter_random_one and
iter_random_one_all_indices would allow comparison of timings.

Current figures:
$ bzr  indexbench ~/source/baz --no-graph --drop-cache
Baseline overhead: Read 92994 keys in 11.223
BTreeIndex: Wrote 4108432 bytes in 29.438
BTreeIndex: iter_all_entries in 4.502
BTreeIndex: iter_random_one in 119.160
BTreeIndex: get_components_positions(2000) in 53.596, bloom(0)
BTreeIndex: search_from_revid in 6.309, bloom(0)
BTreeIndex: miss torture in 39.437, bloom(0)
BTreeIndex: -------Done---------

And with that, gnight.
-Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20080702/fde2880a/attachment.pgp 


More information about the bazaar mailing list