B+Tree indices: ongoing progress
Robert Collins
robertc at robertcollins.net
Wed Jul 2 04:39:04 BST 2008
On Wed, 2008-07-02 at 13:19 +1000, Robert Collins wrote:
> And some good news:
> Baseline overhead: Read 92994 keys in 7.075
> GraphIndex: Wrote 10147567 bytes in 10.949
> GraphIndex: iter_all_entries in 9.671
> GraphIndex: iter_random_one in 11.915
> GraphIndex: get_components_positions(2000) in 20.535
> GraphIndex: search_from_revid in 6.516
> GraphIndex: miss torture in 343.138
> GraphIndex: -------Done---------
> Baseline overhead: Read 92994 keys in 7.072
> BTreeIndex: Wrote 4104336 bytes in 20.299
> BTreeIndex: iter_all_entries in 1.850
> BTreeIndex: iter_random_one in 115.365
> BTreeIndex: get_components_positions(2000) in 33.332
> BTreeIndex: search_from_revid in 4.581
> BTreeIndex: miss torture in 41.803
> BTreeIndex: -------Done---------
>
> Note the search_from_revid and miss torture results.
Some next steps:
- I plan to integrate bloom filters into internal nodes. I'll do this
by reserving some fixed byte count (say 200 bytes) and assuming the
filter won't compress at all. I'll also propogate every key seen all the
way up the in-progress tree. This will also need a 'complete' filter
which will accumulate every key so that when we do add a new row we
don't have to pass over seen keys again.
- iter_random_one concerns me, but I see John is hacking on the lookup
performance, which could help that. A large cache size or even dynamic
cache may help, but I'd like a bloom in place before that to avoid
misses causing cache exhaustion and skewing results.
- I'll add a rich root/subtree formats shortly for more real world
testing.
-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20080702/6806e20e/attachment.pgp
More information about the bazaar
mailing list