[MERGE] Reading pack-names triggers 3 reads

Robert Collins robertc at robertcollins.net
Mon Nov 10 09:32:52 GMT 2008


On Sun, 2008-11-09 at 16:27 -0600, John Arbash Meinel wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> John Arbash Meinel wrote:
> > While I was at it, I also found that 'BTree.iter_all_entries()' on the
> > pack-names file, causes it to be read 3 times.
> > 
> > 1) iter_all_entries() calls '.key_count()' which reads the root node
> > 2) It then proceeds to call _read_nodes() for all "needed" nodes, but
> > this includes the root node.
> > 3) _read_nodes() when it doesn't know the size of the index, would
> > trigger a read of the index, just to get its size, and then throw away
> > the bytes, just to read it again in the next call.
> > 
> > This patch adds both fixes, and finally drops the number of reads of
> > "pack-names" to 1 as it should be.
> > 
> 
> Here is a more appropriate fix which just changes reading pack-names
> from 3 times down to one. It includes tests, etc. This path really only
> happens for 'pack-names' because it is the only file that we don't know
> the size ahead of time.

So, I've read the patch and it looks generally ok. I had a few concerns:
 - doesn't the page cache handle a request for the root? (e.g. is the
cache not working, or is _read_nodes below that layer0.

 - I don't like the 'read full if we don't know size' change - I often
   work with bzrlib on big indices - hundred + megabytes, and this makes
   the api stop being memory capped - it becomes easier to get wrong.
   I appreciate the need to reduce round trips here, but I think simply
   changing it like you have will harm things. I'd like to see
   it require an explicit parameter or setting rather than be
   unconditional.

> There was also a subtle bug in the code if pack-names ever became larger
> than a single page (which is really unlikely, but theoretically possible).

I'm not sure where that bug was - could you enlarge on it?

bb:resubmit

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20081110/33394d79/attachment.pgp 


More information about the bazaar mailing list