New toy for bzr developers: B+Tree index sketch
Robert Collins
robertc at robertcollins.net
Tue Jul 1 21:36:18 BST 2008
On Tue, 2008-07-01 at 14:07 -0500, John Arbash Meinel wrote:
> | Not tested yet:
> | - high latency (NFS or Remote or http) links
>
> Did you have simple scripts to make it easy to reproduce your results?
yeah, I sent that at midnight; brain not worky so well. I'll add a
couple of extra things and make it more parameterised and attach it
later today.
Thanks for spending time tuning the size. I don't think 4K is
necessarily best; indeed, 64K would clearly be better in terms of
accessing remote servers *if* we get a high hit rate. (Reading 128K to
access one key, or to determine that one key is missing, is bad compared
to 8K; we want lots of accesses to the same node to amortise the
overhead of the read.)
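
To make the trade-off concrete, here is a rough back-of-envelope model,
not a measurement. It assumes each page read costs one round trip plus
transfer time, and spreads that cost over the lookups the page ends up
serving. The function name and the latency and bandwidth figures are
hypothetical, chosen only for illustration.

```python
def cost_per_lookup(page_size, hits_per_page, latency_s, bandwidth_bytes_per_s):
    """Amortised seconds per key lookup for one page read.

    Assumed model: one round trip plus transfer time per page, shared
    equally by every lookup that page serves.
    """
    read_time = latency_s + page_size / bandwidth_bytes_per_s
    return read_time / hits_per_page


if __name__ == "__main__":
    # Hypothetical link: 100 ms round trip, 1 MB/s throughput.
    latency, bandwidth = 0.1, 1e6
    for page_size in (4096, 8192, 65536):
        for hits in (1, 16):
            cost = cost_per_lookup(page_size, hits, latency, bandwidth)
            print(page_size, hits, round(cost, 4))
```

Under this model a large page only pays off when the hit rate is high:
with a single hit per page the bigger read is pure overhead.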
> John
> =:->
>
> PS> It would all be so much simpler if we could just snapshot the state
> of the compressor. Do the flush, restore the state, etc. zlib itself
> supports it with deflateCopy(). Though it does warn:
There is a .copy() on compressobj - I checked the python code :). But it
depends on whether the zlib present when python is built has deflateCopy,
so I felt it was likely to give us installation issues.
> ... Note that deflateCopy duplicates the internal compression state
> which can be quite large, so this strategy is slow and can consume lots
> of memory.
OTOH zlib is used for very large data sets, and our typical indices are
not that large; individual pages definitely are not.
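
A minimal sketch of the snapshot idea, assuming the Python build exposes
.copy() on compress objects (i.e. the underlying zlib has deflateCopy).
This is not the code from the B+Tree patch: pack_page and PAGE_SIZE are
hypothetical names, and the 4096-byte page is simply the size discussed
in this thread. Each line is tried on a copy of the compressor; if the
finished page would overflow, the copy is thrown away.

```python
import zlib

PAGE_SIZE = 4096  # page size discussed in this thread (assumption)


def pack_page(lines, page_size=PAGE_SIZE):
    """Pack as many byte-string lines as fit into one compressed page.

    Returns (page_bytes, number_of_lines_used).
    """
    compressor = zlib.compressobj()
    chunks = []  # compressed output accepted so far
    size = 0     # total length of the accepted chunks
    used = 0
    for line in lines:
        # Work on a snapshot so a failed attempt leaves `compressor` intact.
        trial = compressor.copy()
        out = trial.compress(line)
        # Finish a throwaway copy to learn how big the final page would be.
        tail = trial.copy().flush(zlib.Z_FINISH)
        if size + len(out) + len(tail) > page_size:
            break  # does not fit; keep the state from before this line
        compressor = trial
        chunks.append(out)
        size += len(out)
        used += 1
    chunks.append(compressor.flush(zlib.Z_FINISH))
    return b"".join(chunks), used
```

Copying the state twice per line is the cost the zlib warning refers to;
for page-sized inputs that may well be acceptable, but it is worth
measuring before relying on it.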
> I also tried switching to 8192-sized pack files. And with my code, it
> drops the size to 3.2MB. So about a 3% savings. Certainly not nearly as
> much as just trying to pack more into the existing 4096 bytes.
-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.