Better compression
John Arbash Meinel
john at arbash-meinel.com
Fri Jul 25 21:24:53 BST 2008
John Arbash Meinel wrote:
...
|
| 4) Something is causing a plain "bzr log --short -r -10..-1" to be
| slower with the new repository. Specifically for bzr.dev on a fully
| packed repository on both sides:
|
| $ time bzr log dev-gc/bzr.dev
| real 0m1.372s
|
| $ time bzr log dev-packs/bzr.dev
| real 0m0.499s
|
| So we should probably investigate what is happening here.
I did dig into this. It seems the decompression logic doesn't pay
attention to the fact that we are extracting multiple texts from the
same compressed hunk.
So it does a full decompress of the 2.4MB zdata => 10.8MB of plain data
and then extracts the one text from there. It then repeats that ten
times for the ten texts in the same zdata chunk.
Attached is a patch which just re-uses the last data source. It drops
the "bzr log --short -r -10..-1" time from 1.4s => 0.6s.
Even better, it drops the "bzr log --short -r -100..-1" time from 9s =>
1.0s.
(Though packs are still faster at 0.44s and 0.6s respectively.)
Of course, if you go up to -r -300 it still gets better (30s => 1.5s.)
I would guess it scales indefinitely, simply because *all* of my
revision texts are probably stored in a single 10MB gc chunk
(zlib-compressed down to 2MB).
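The idea in the patch can be sketched as a one-entry cache of the last
decompressed hunk. This is just a standalone illustration, not bzr's
actual groupcompress code; the ChunkExtractor class and its
(key -> (hunk_id, start, end)) index layout are made up for the example:

```python
import zlib

class ChunkExtractor:
    """Extract texts from zlib-compressed hunks, remembering the last
    decompressed hunk so consecutive extracts from the same hunk skip
    the repeated (and expensive) full decompression."""

    def __init__(self, compressed_hunks, index):
        self._hunks = compressed_hunks   # hunk_id -> zlib-compressed bytes
        self._index = index              # text key -> (hunk_id, start, end)
        self._last_hunk_id = None
        self._last_plain = None

    def extract(self, key):
        hunk_id, start, end = self._index[key]
        if hunk_id != self._last_hunk_id:
            # Cache miss: decompress the whole hunk once and keep it.
            self._last_plain = zlib.decompress(self._hunks[hunk_id])
            self._last_hunk_id = hunk_id
        # Cache hit: just slice the already-decompressed data.
        return self._last_plain[start:end]
```

With ten texts in one hunk this turns ten full decompressions into one,
which matches the roughly 10x improvement seen above for -r -100..-1.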
John
=:->
Attachment: gc_extract_cache_last.txt
https://lists.ubuntu.com/archives/bazaar/attachments/20080725/0b0ef215/attachment.txt