Better compression

John Arbash Meinel john at arbash-meinel.com
Thu Jul 17 13:25:58 BST 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ian Clatworthy wrote:
| Robert Collins wrote:
|> On Thu, 2008-07-17 at 16:00 +1000, Rob Weir wrote:
|>> On 17 Jul 2008, raindog at macrohmasheen.com wrote:
|>>> What level of perf increase is seen with this? Do you have any
|>>> comparisons?
|>> With regards to space, I converted bzr.dev from pack-0.92 to
|>> btree+groupcompress:
|>>
|>> 85664   backup.bzr
|>> 40232   .bzr/
|>>
|>> a saving of over 50%.
|
| I saw a similar gain on Python last night: 346.5 MB -> 181.4 MB.
|
|> The compressor would have been getting texts in topological order there.
|> It does much better with reverse-topological order, which is why I need
|> a new InterRepository object or something... I tested a bzr.dev
|> repository and saw more like 30 MB with better ordering in use, but it
|> was a custom-hack rather than a reusable thing.
|
| So I'm wondering what that implies w.r.t. maximising how fast-import
| works. If we know that compressing in a forward direction will generally
| take more space than compressing going in the other direction, we'll
| probably always want to consume the stream *quickly*, then compress as
| much as we can during the final pack that fastimport does, yes?
|
| So I guess I'm requesting enough flexibility in our API to say
| "load quickly", so that fastimport doesn't take forever over-compressing
| stuff on a first pass when that work will only be thrown away.
|
| Ian C.
|
|
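
For reference, the relative savings quoted above can be checked with a few
lines of Python. The figures are taken exactly as posted; the thread does not
state the units, which does not matter for a ratio. The helper name `saving`
is purely illustrative.

```python
def saving(before, after):
    """Return the fraction of space saved going from `before` to `after`."""
    return 1 - after / before

# Sizes as quoted in the thread (pack-0.92 vs. btree+groupcompress).
bzr_dev = saving(85664, 40232)
python_repo = saving(346.5, 181.4)

print(f"bzr.dev: {bzr_dev:.1%}")
print(f"Python:  {python_repo:.1%}")
```

That works out to roughly 53% for bzr.dev and a little under 48% for Python.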

Another possibility is to just have "bzr pack" do the reordering.
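
As a rough illustration of what "reordering" means here, the sketch below
emits revisions in reverse-topological order (children before parents), which
is the order Robert says the compressor prefers. This is not bzrlib code: the
toy graph, the revision ids and the function name are all made up for the
example, and it assumes Python 3.9+ for the standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical toy revision graph: revision id -> list of parent ids.
parents = {
    "rev-1": [],
    "rev-2": ["rev-1"],
    "rev-3a": ["rev-2"],
    "rev-3b": ["rev-2"],
    "rev-4": ["rev-3a", "rev-3b"],
}

def reverse_topological(parent_map):
    """Return revision ids with every child listed before its parents."""
    # static_order() yields each node after its predecessors, which are the
    # parents in this mapping, so the forward list is oldest-first.
    forward = list(TopologicalSorter(parent_map).static_order())
    forward.reverse()  # newest-first: children now precede their parents
    return forward

order = reverse_topological(parents)
print(order)
```

A pack step could walk the texts in this order and hand them to the
compressor newest-first, leaving the initial import free to write them in
whatever order is fastest.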

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkh/OlYACgkQJdeBCYSNAAO2uACgtyuxCh3aSlvWOxZxFhytTDwx
ZzwAoJDgZyRcllsVZPlngpnC5vT1GzxD
=+tcs
-----END PGP SIGNATURE-----


