Better compression
John Arbash Meinel
john at arbash-meinel.com
Fri Jul 25 21:24:53 BST 2008
John Arbash Meinel wrote:
...
|
| 4) Something is causing a plain "bzr log --short -r -10..-1" to be
| slower with the new repository. Specifically for bzr.dev on a fully
| packed repository on both sides:
|
| $ time bzr log dev-gc/bzr.dev
| real 0m1.372s
|
| $ time bzr log dev-packs/bzr.dev
| real 0m0.499s
|
| So we should probably investigate what is happening here.
I did dig into this. It seems the decompression logic doesn't pay
attention to the fact that we are extracting multiple texts from the
same compressed hunk.
So it does a full decompress of the 2.4MB zdata => 10.8MB of plain data
and then extracts the one text from there. It then repeats that ten
times for the ten texts in the same zdata chunk.
Attached is a patch which just re-uses the last data source. It drops
the "bzr log --short -r -10..-1" time from 1.4s => 0.6s.
Even better, it drops the "bzr log --short -r -100..-1" time from 9s =>
1.0s.
(Though packs are still faster at 0.44s and 0.6s respectively.)
Of course, if you go up to -r -300 it still gets better (30s => 1.5s.)
I would guess it scales indefinitely, simply because *all* of my
revision texts are probably stored in a single 10MB gc chunk
(zlib-compressed down to 2MB).
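The idea in the patch can be sketched as a one-entry cache of the last
decompressed hunk. This is just a standalone illustration, not bzr's
actual groupcompress code; the ChunkExtractor class and its
(key -> (hunk_id, start, end)) index layout are made up for the example:

```python
import zlib

class ChunkExtractor:
    """Extract texts from zlib-compressed hunks, remembering the last
    decompressed hunk so consecutive extracts from the same hunk skip
    the repeated (and expensive) full decompression."""

    def __init__(self, compressed_hunks, index):
        self._hunks = compressed_hunks   # hunk_id -> zlib-compressed bytes
        self._index = index              # text key -> (hunk_id, start, end)
        self._last_hunk_id = None
        self._last_plain = None

    def extract(self, key):
        hunk_id, start, end = self._index[key]
        if hunk_id != self._last_hunk_id:
            # Cache miss: decompress the whole hunk once and keep it.
            self._last_plain = zlib.decompress(self._hunks[hunk_id])
            self._last_hunk_id = hunk_id
        # Cache hit: just slice the already-decompressed data.
        return self._last_plain[start:end]
```

With ten texts in one hunk this turns ten full decompressions into one,
which matches the roughly 10x improvement seen above for -r -100..-1.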
John
=:->
Attachment: gc_extract_cache_last.txt
https://lists.ubuntu.com/archives/bazaar/attachments/20080725/0b0ef215/attachment.txt