Split inventory prefix work

Mon Dec 22 21:11:40 GMT 2008

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I have a branch of the brisbane work, which is able to extract the
common prefix for a page into a separate record. For example, imagine
storing the keys: "aaa", "aab", "aac". The old code would have:

aaa\0value\n
aab\0value\n
aac\0value\n

The prefix work would change that to:

aa\n
a\0value\n
b\0value\n
c\0value\n

In general, this makes the serialized form smaller, which means you can
fit more entries in a 4k page.

In actually implementing it, I found an slightly unexpected result. For
the first 1k revs in mysql, doing this for the Leaf nodes actually
causes the total inventory size to *increase*. Both for 'raw' and for
'compressed' size. Doing it for Internal nodes decreases the raw and a
little bit the compressed size.

My guess is that because we are fitting more entries in a Leaf node,
when one line changes, it has to write the neighbor entries out.
(Consider if you had a tree where all the leaves had a single entry,
then changing an entry would only write a single new line.) I would
expect this result to be a bit different when we introduce delta
compression, as it should remove some of the extra redundancy.

Here are some results:

                      Raw    %    Compressed    %  Objects
No prefix       17905 KiB   1%      9503 KiB  43%    20970
prefix internal 17102 KiB   1%      9392 KiB  43%    20957
prefix both     17258 KiB   1%      9444 KiB  43%    19678

So you can see that when including leaf nodes the number of objects
decreases, but the compressed and raw sizes actually increase.

Comparing the raw details:
Extra Info:           count    total  avg stddev  min  max
internal node refs    11864   114461    9    8.1    2   27
internal p_id refs     1256     6840    5    5.7    2   24
inv depth              6005    35566    5    3.5    1    9
leaf node items        6005    34646    5    5.2    1   16
leaf p_id items         802     6566    8    9.1    1   33
p_id depth              802     7050    8    4.2    1   14

vs

Extra Info:           count    total  avg stddev  min  max
internal node refs    10952   111724   10    7.6    2   27
internal p_id refs     1129     6087    5    5.7    2   24
inv depth              5991    34565    5    3.2    1    9
leaf node items        5991    37466    6    5.4    1   16
leaf p_id items         563     7951   14   13.0    1   52
p_id depth              563     4816    8    3.5    1   11

So the big win is in the parent_id,basename => file_id map. Where we go
from 800 leaf nodes to 563 leaf nodes. And a max depth of 14 to a max
depth of 11. (Note the average depth of 8 doesn't change, but we go from
33 nodes on a page up to 52, and avg 8=>14).

The actual "file_id=>entry" nodes don't change very much. 6005 leaf
nodes down to 5991 leaf nodes.

This is about what I expected, because I knew that the p_id map has a
lot more redundancy (the parent_id portion of the key is duplicated
across all of the files in that directory.)

Anyway, I'm not proposing it for merge yet. I think it is overall
beneficial, but it isn't a huge effect.

John
=:->

PS> The branch is at:

http://bzr.arbash-meinel.com/branches/bzr/brisbane/prefix
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAklQAowACgkQJdeBCYSNAAM5ggCdH/h5F5bn+W2agfpFh6cy8noZ
E0QAoL6h6skJJwNWwxTM0pV7HlVfSjt+
=F2Yh
-----END PGP SIGNATURE-----