[RFC] pack index lookup performance

Robert Collins robertc at robertcollins.net
Tue Apr 15 22:17:34 BST 2008


On Wed, 2008-04-16 at 00:10 +0300, John Arbash Meinel wrote:
> Aaron Bentley wrote:
> | Robert Collins wrote:
> |> Indeed; this is where the hit/miss ratio is useful because hits get
> |> cached per index, but there is no miss cache at the moment.
> |
> |> So if you ask a 1-record index that has been fully parsed for 15K keys
> |> it does not have, it does 15K bisections.
> |
> | It seems we should be examining the hit caches of all indices before
> | doing more expensive operations.  Are we doing that?

Not at the moment; what we do is ask each index in turn, and it returns
what it has. What John says below applies to each single index.
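
As a rough illustration of that behaviour (a sketch, not the bzrlib code):
each index caches its own hits, nothing remembers misses, so a small index
re-bisects for every key it lacks on every query. TinyIndex, lookup_in_turn
and the bisections counter are hypothetical names made up for this example.

```python
import bisect


class TinyIndex:
    """Hypothetical stand-in for one fully parsed index."""

    def __init__(self, keys):
        self._sorted_keys = sorted(keys)
        self._hit_cache = set()   # keys this index is known to hold
        self.bisections = 0       # counts lookups not answered from cache

    def lookup(self, keys):
        """Return the subset of `keys` present in this index."""
        found = set()
        for key in keys:
            if key in self._hit_cache:
                found.add(key)
                continue
            # No miss cache: an absent key costs a bisection every time.
            self.bisections += 1
            pos = bisect.bisect_left(self._sorted_keys, key)
            if pos < len(self._sorted_keys) and self._sorted_keys[pos] == key:
                self._hit_cache.add(key)
                found.add(key)
        return found


def lookup_in_turn(indices, keys):
    """Ask each index in turn; each one returns what it has."""
    remaining = set(keys)
    found = {}
    for index in indices:
        if not remaining:
            break
        hits = index.lookup(remaining)
        for key in hits:
            found[key] = index
        remaining -= hits
    return found
```

Putting a one-record index ahead of a larger one and asking twice for keys
that only the larger one holds makes the small index bisect for every key on
both passes, while the larger index answers the second pass from its hit
cache.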

> Yes. At least according to the code comments.
> 
> The first loop determines what needs to be read from disk, and it does 3 checks.
> 
> 1) Is the key itself in the _bisect_nodes cache? If so, do not read
> 2) Would the location of the key fall within the _parsed_key_map? If so, do not read
> 3) Is the byte range for the key in our _parsed_byte_map ranges? If so, do not read
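
The three checks in that first loop can be sketched roughly as below. This
is only an approximation under assumptions: needs_read is a hypothetical
helper, the two maps are assumed to be lists of (start, end) pairs, and the
ranges are scanned linearly for clarity rather than bisected.

```python
def needs_read(key, location, bisect_nodes, parsed_key_map, parsed_byte_map):
    """Return True if `key`, expected near byte `location`, needs a read.

    Assumed layouts (hypothetical, for illustration only):
      bisect_nodes    -- dict of already parsed keys
      parsed_key_map  -- list of (first_key, last_key) parsed key ranges
      parsed_byte_map -- list of (start, end) parsed byte ranges, end exclusive
    """
    # 1) The key itself has already been parsed.
    if key in bisect_nodes:
        return False
    # 2) The key sorts inside a parsed key range, so it is known to be
    #    absent without touching the disk.
    if any(low <= key <= high for low, high in parsed_key_map):
        return False
    # 3) The byte location we would read has already been parsed.
    if any(start <= location < end for start, end in parsed_byte_map):
        return False
    return True
```

Only keys for which all three checks fail go on to the disk/network read.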
> 
> Then we go and do disk/network reads, and parse the blocks into internal structures.
> 
> Then we have a second loop over each key

This second loop resolves the pointers used for the dictionary
compression of each key's parents.
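
A minimal sketch of that resolution step, assuming each parsed node stores
its parents as pointers (here, byte offsets) that have to be mapped back to
keys once the referenced records have been parsed. resolve_parent_pointers,
raw_nodes and keys_by_offset are hypothetical names, not the real internals.

```python
def resolve_parent_pointers(raw_nodes, keys_by_offset):
    """Replace pointer-style parent references with the keys they name.

    raw_nodes      -- dict: key -> (value, tuple of parent pointers)
    keys_by_offset -- dict: pointer (assumed to be a byte offset) -> key
    """
    resolved = {}
    for key, (value, parent_offsets) in raw_nodes.items():
        # Each pointer must refer to a record that has been parsed already.
        parents = tuple(keys_by_offset[offset] for offset in parent_offsets)
        resolved[key] = (value, parents)
    return resolved
```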

I have a slightly old branch
http://people.ubuntu.com/~robertc/baz2.0/index.range_map which cleans up
some of the internals. I don't recall its test status etc.

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.