[MERGE] 'bzr pack' tells the index builder to optimize

John Arbash Meinel john at arbash-meinel.com
Thu Oct 16 21:05:22 BST 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The attached patch updates the OptimisingPacker to tell the index
builders that it wants them to optimize as well.

For my mostly-bzr repository (with some plugins, etc.) it changes the
'bzr pack' time and index size from:

real    0m36.177s
5.7M    .bzr/repository/indices

into

real    0m45.786s
4.5M    .bzr/repository/indices


So 'bzr pack' is about 27% slower, but the indices compact down to about
79% of their original size (-21%).
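
For anyone who wants to re-check the arithmetic, here is a minimal sketch
that derives both percentages from the figures quoted above. The helper
name parse_real is made up for illustration and is not part of bzr.

```python
def parse_real(value):
    """Convert a `time` 'real' value such as '0m36.177s' to seconds."""
    minutes, seconds = value.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Figures copied from the two 'bzr pack' runs quoted above.
before = parse_real("0m36.177s")
after = parse_real("0m45.786s")

# Extra wall-clock time for the optimizing pack, as a percentage.
slowdown_pct = (after / before - 1) * 100

# Index directory size after packing, relative to before (5.7M -> 4.5M).
size_pct = 4.5 / 5.7 * 100

print("pack time: +%.1f%%" % slowdown_pct)
print("index size: %.1f%% of original" % size_pct)
```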

The overall effect of this depends mostly on the operation. For
operations like "bzr log --short", there isn't a big difference because
before and after still only need to read a single page from the index.

However, 'bzr log --long' can see a more significant benefit, because it
downloads most/all of the revision index. I created a snapshot of the
bzr.dev repository in btree format, with compact indices, non-compact
indices, and compact indices in a single pack. I also used my prefetch
branch for these tests.

Here are the times for 'bzr log --long -r -10..-1':

btree
real    0m25.069s
real    0m18.018s
real    0m21.404s
real    0m19.546s
real    0m20.748s

btree compact
real    0m21.528s
real    0m16.786s
real    0m17.004s
real    0m17.612s
real    0m20.389s

btree compact + single packfile
real    0m19.937s
real    0m24.055s
real    0m16.427s
real    0m16.551s
real    0m15.460s

I'm including the raw numbers so someone like Matthew can run his
ministat program and give us a nice graph :). Going off of the 'best'
times, it ends up being:

18.0s (btree) vs 16.8s (compact) vs 15.5s (compact + single packfile)

So compact indices are about 7% faster than non-compact ones (about 14%
with a single packfile). Not a huge difference in final effect. I would
imagine it has a larger effect on things that access the text graph. I'm
guessing that those graphs compress better, and thus are more affected
by the strength of packing.

du -ksh sizes .bzr/repository/indices

format            size    delta   % of previous row

all indices
0.92-pack         9.5M
btree             4.6M    -4.9M    48%
btree compact     3.6M    -1.0M    78%
compact single    3.6M    -0.0M   100%

.rix
0.92-pack         1.6M
btree             787k    -813k    49%
btree compact     709k     -78k    90%
compact single    700k      -9k    99%

.tix
0.92-pack         5.7M
btree             2.7M    -3.0M    47%
btree compact     1.9M    -0.8M    70%
compact single    1.9M    -0.0M   100%

You can see that the extra packing saves approx 1MB of index space, but
80% of that is the 0.8M saved in the .tix files.

So a 7% improvement from a 10% smaller revision index (.rix) is probably
reasonable. But it also means we could see up to a 30% improvement on
things like "bzr log file", since the .tix index shrinks by 30%.
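
A small sketch of where the 10% and 30% figures come from, using the
.rix and .tix sizes listed above. Treating the index shrinkage as a rough
upper bound on the time saved is my assumption here, not something I have
measured.

```python
# Index sizes from the tables above, in kB. The M values are converted as
# 1M = 1000k and the tables are rounded, so this is only approximate.
sizes_kb = {
    ".rix": {"btree": 787, "btree compact": 709},
    ".tix": {"btree": 2700, "btree compact": 1900},
}

shrink_pct = {}
for ext, row in sizes_kb.items():
    # How much smaller the compact index is than the plain btree index.
    shrink_pct[ext] = (1 - row["btree compact"] / row["btree"]) * 100
    print("%s: compact index is %.0f%% smaller" % (ext, shrink_pct[ext]))
```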

Anyway, patch is attached for your review,

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkj3noIACgkQJdeBCYSNAAMq0QCfWmHVnJB7/11ygwjjq7YNCEaS
tj4An1d1Iwo13hOYMTQm5G0Lr0LiJvPN
=PJK6
-----END PGP SIGNATURE-----
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: btree_optimize.patch
Url: https://lists.ubuntu.com/archives/bazaar/attachments/20081016/50564254/attachment-0001.diff 

