Split inventory prefix work
John Arbash Meinel
john at arbash-meinel.com
Tue Dec 23 00:34:00 GMT 2008
>
> So the big win is in the parent_id,basename => file_id map, where we go
> from 800 leaf nodes to 563 leaf nodes, and from a max depth of 14 to a
> max depth of 11. (Note the average depth of 8 doesn't change, but we go from
> 33 nodes on a page up to 52, and avg 8=>14).
>
> The actual "file_id=>entry" nodes don't change very much. 6005 leaf
> nodes down to 5991 leaf nodes.
>
> This is about what I expected, because I knew that the p_id map has a
> lot more redundancy (the parent_id portion of the key is duplicated
> across all of the files in that directory.)
>
> Anyway, I'm not proposing it for merge yet. I think it is overall
> beneficial, but it isn't a huge effect.
>
> John
> =:->

I also wanted to mention a specific concern about this. I'll call the
feature "prefix extraction", just to have a term for it.

My concern is how prefix extraction will interact with delta compression.

As an example, imagine you have the keys 'aaa', 'aab', and 'aac', which all
share the common prefix 'aa' and all fit into the same leaf node. Then you
add 'abb', which also fits into the same leaf node. Suddenly every line in
the node changes, because the common prefix is now just 'a' instead of 'aa',
so a delta against the previous version of the node would presumably have
nothing left to match.

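
To make the example concrete, here is a minimal sketch in Python. The
leaf layout is an assumption made up for illustration (one "prefix=" line
followed by one line per key suffix); it is not the actual serialisation
format, and serialise_leaf and reusable_lines are hypothetical helpers.
It counts how many lines of the old node text also appear in the new one,
as a rough stand-in for what a line-based delta could reuse.

```python
import os.path


def serialise_leaf(keys, extract_prefix):
    """Return the lines of a hypothetical leaf node holding `keys`.

    With extract_prefix=True the common prefix is written once and each
    key line holds only the remaining suffix. This layout is an
    assumption for illustration, not the real on-disk format.
    """
    keys = sorted(keys)
    if not extract_prefix:
        return list(keys)
    # commonprefix works character by character on a list of strings.
    prefix = os.path.commonprefix(keys)
    return ["prefix=" + prefix] + [key[len(prefix):] for key in keys]


def reusable_lines(old_lines, new_lines):
    """Count distinct lines present in both versions of the node."""
    return len(set(old_lines) & set(new_lines))


old_keys = ["aaa", "aab", "aac"]
new_keys = old_keys + ["abb"]

for extract in (False, True):
    old = serialise_leaf(old_keys, extract)
    new = serialise_leaf(new_keys, extract)
    print("extract_prefix=%s: %d of %d old lines reusable"
          % (extract, reusable_lines(old, new), len(old)))
```

Without prefix extraction the three existing key lines survive the
insert unchanged. With it, the prefix line and every suffix line differ,
so under this assumed layout none of the old lines can be reused.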
John
=:->