BTree + CHK Inefficiencies

Gordon Tyler gordon.tyler at gmail.com
Sat Aug 7 14:35:42 BST 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 8/6/2010 5:46 PM, John Arbash Meinel wrote:
> Aside from that, I'd have to say "it depends". The big-daddy content
> will get put into the same pack-files as the small content, which means
> for extracting small content you have to skip over those big blocks.
> (better disk coherency if you split them up). We might even try to
> intermix the big content with the small content in a compression group,
> depending on how everything lays out.

Have you ever considered grouping content into compression groups by
content-type (or even just file extension), on the theory that similar
content will compress together better?
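
To make the grouping idea concrete, here is a rough sketch of bucketing paths by file extension before handing each bucket to a compressor. The helper name `group_by_extension` and the `"<none>"` bucket are made up for illustration; this is not how bzr currently builds its compression groups.

```python
import os
from collections import defaultdict


def group_by_extension(paths):
    """Bucket paths by lowercased extension (hypothetical helper)."""
    groups = defaultdict(list)
    for path in paths:
        # splitext returns ("README", "") for names without an extension.
        ext = os.path.splitext(path)[1].lower() or "<none>"
        groups[ext].append(path)
    return dict(groups)


if __name__ == "__main__":
    for ext, members in group_by_extension(
            ["a.c", "b.C", "README", "img.png"]).items():
        print(ext, members)
```

Each bucket would then become a candidate for its own compression group, on the assumption that files sharing an extension share enough structure to delta well against each other.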

My other thought was to separate packs by the size of the content, e.g.
files < 1MB go in one pack and files >= 1MB go in another, maybe with a
few more levels as necessary. The small files, which change a lot
compared to the large files, then wouldn't cause repacking of the large
files.
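
As a sketch of what I mean by size levels: the function below maps a content size to a pack tier. The 1 MiB and 16 MiB boundaries and the names `TIER_LIMITS` and `pack_tier` are only assumptions for illustration, not anything that exists in bzr.

```python
import bisect

MIB = 1024 * 1024

# Hypothetical tier boundaries in bytes; tier 0 holds content below the
# first limit, tier 1 content from the first limit up to the second, etc.
TIER_LIMITS = [1 * MIB, 16 * MIB]


def pack_tier(size):
    """Return the index of the pack tier for content of `size` bytes."""
    # bisect_right counts how many limits are <= size, so content of
    # exactly 1 MiB lands in tier 1 rather than tier 0.
    return bisect.bisect_right(TIER_LIMITS, size)


if __name__ == "__main__":
    for size in (10, MIB - 1, MIB, 20 * MIB):
        print(size, "-> tier", pack_tier(size))
```

With something like this, a repack triggered by churn in tier 0 would only need to rewrite the tier 0 packs and could leave the larger tiers alone.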

Ciao,
Gordon
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJMXWEtAAoJEIrPJfWinA2u8nUH/RbXzK19Z4q6zHJI5FrUq5Et
eTfrP9CYF2hqV8o/W//CbSms/Krr2fbVTNniK1tlIBjErfpJHSoYO2Vj1tEC0unP
8JnFX6JoNKJRaefjYCIUPtjFoMdwLjaQ0MGUTmRFrl9lkxLpfEP/zZR2sWKYclO6
H2Tv6bQTVNtV3a/BPUtAIwlNosSE7BcOYFkv3rTryuL8/imTaS4pm+z+Uw3S4YC3
+0kzBQqqgzniHkw7z5Y0v/Udlht0vSVBaKU5h3bpFLev0ONtPJAcs2XoW25a46mn
7nKv1Btnr4E9qgEeYwMgs3s8zCtBKiTonACn1Bz8L5aBlt1z085vVB3v9rcGYaI=
=Ill6
-----END PGP SIGNATURE-----


