[MERGE] Set allow_optimize=False when spilling btree content to disk

John Arbash Meinel john at arbash-meinel.com
Thu Mar 19 21:02:50 GMT 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John Arbash Meinel wrote:
> The attached patch changes the btree code slightly when spilling indices
> to disk to avoid memory pressure.

This patch builds on the earlier one, though the two are mostly
independent. It is the more controversial of the two, so I split it out
so that the other can land fairly easily.

This changes the spill-over code so that it no longer combines the
spilled indices in 'power-of-2' fashion. This means that if your index
spills as you build it, it becomes more and more inefficient to query,
which is a bad thing. But it also means that you don't "waste" time
rebuilding the index, just to rebuild it again when you finish.
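
As a rough illustration of the trade-off (a toy model, not the actual
bzrlib code), the sketch below counts how many on-disk indices are left
to query and how many nodes get written before the final merge, with and
without power-of-2 combining. The function name, the binary-counter slot
layout and the 100k spill threshold are all assumptions made for the
example.

```python
def simulate_spills(total_nodes, spill_at, combine):
    """Count on-disk indices and nodes written while building an index.

    Hypothetical model: every `spill_at` nodes the in-memory buffer is
    written out. With combine=True a spill is merged with the existing
    on-disk indices like a binary counter (slot i is either empty or
    holds 2**i spills). With combine=False each spill simply becomes a
    new on-disk index.
    """
    slots = []      # slot i holds the node count of an index, or None
    rewritten = 0   # nodes written to disk before the final merge
    combines = 0    # spills that had to merge older indices
    for _ in range(total_nodes // spill_at):
        size = spill_at
        if combine:
            merged = False
            pos = 0
            # Absorb every occupied slot until an empty one is found.
            while pos < len(slots) and slots[pos] is not None:
                size += slots[pos]
                slots[pos] = None
                merged = True
                pos += 1
            if pos == len(slots):
                slots.append(None)
            slots[pos] = size
            combines += merged
        else:
            slots.append(size)
        rewritten += size
    live = [s for s in slots if s is not None]
    return {"indices": len(live), "rewritten": rewritten, "combines": combines}


with_combine = simulate_spills(400_000, 100_000, combine=True)
without_combine = simulate_spills(400_000, 100_000, combine=False)
```

Under these assumptions, combining leaves 1 index to query but writes
800k nodes across 2 combines, while not combining leaves 4 indices to
query and writes only 400k nodes.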

I guess it also means that the final merge sort has fewer streams to
combine, though honestly that code has gotten pretty efficient.
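
For readers unfamiliar with that final step, here is a minimal sketch of
merging several already-sorted streams into one, using the standard
library's heapq.merge. The function name and the sample (key, value)
data are made up for the example; the real btree code has its own
iterator logic.

```python
import heapq


def merge_sorted_streams(streams):
    """Lazily merge already-sorted (key, value) streams into one sorted stream."""
    return heapq.merge(*streams)


# Three hypothetical spilled indices, each sorted by key.
spilled = [
    [(("a",), "1"), (("d",), "4")],
    [(("b",), "2"), (("e",), "5")],
    [(("c",), "3")],
]
merged = list(merge_sorted_streams(spilled))
```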

The main motivation for this is that "time bzr pack launchpad-gc255big"
drops from a peak of about 12 minutes down to around 10m30s on my
laptop. LP has about 400k chk nodes and 240k text nodes, so it hits the
spill-to-disk quite a bit. (At just over 400k nodes, we have 2 combines,
and then one final overall combine.)

I'm also changing the 'autopack' and 'pack' code to set 'random_id=True',
because the stream source knows that it isn't sending duplicate nodes.
There is a small chance that a node will duplicate one in another pack
file that isn't involved in the autopack, but that doesn't hurt anything.
(I will note that I think 'bzr pack' without random_id=True ends up
close to the same speed as before, because the lookup overhead goes up
and counteracts the savings from not repacking the files. But with both
this change and random_id=True, it is definitely a lot faster on big
trees.)
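
To make the random_id point concrete, here is a toy sketch of a builder
that skips its duplicate-key lookup when the caller promises the keys
are unique. ToyIndexBuilder and its attributes are hypothetical and are
not the bzrlib API; only the idea of a random_id flag comes from the
patch.

```python
class ToyIndexBuilder:
    """Hypothetical stand-in for an index builder; not the bzrlib API."""

    def __init__(self, existing_keys=()):
        self._existing = set(existing_keys)  # keys already spilled to disk
        self._pending = {}                   # keys still held in memory
        self.lookups = 0                     # duplicate checks performed

    def add_node(self, key, value, random_id=False):
        # With random_id=True the caller asserts the key cannot collide,
        # so the (potentially expensive) duplicate lookup is skipped.
        if not random_id:
            self.lookups += 1
            if key in self._existing or key in self._pending:
                raise KeyError("duplicate key: %r" % (key,))
        self._pending[key] = value


checked = ToyIndexBuilder(existing_keys=[("rev-1",)])
unchecked = ToyIndexBuilder(existing_keys=[("rev-1",)])
for n in range(2, 5):
    key = ("rev-%d" % n,)
    checked.add_node(key, "data")
    unchecked.add_node(key, "data", random_id=True)
```

In this model the cost of each lookup grows with the number of spilled
indices that have to be consulted, which is why skipping it matters more
once the spilled indices are no longer combined.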

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAknCsvoACgkQJdeBCYSNAAPPAgCfTRiQXHRK1dIenVg+VEnZ/zeJ
RroAoK7NubJ8rHybhtmWh1zgN3p/vFVx
=szDt
-----END PGP SIGNATURE-----
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 1.14-btree_spill_no_combine.patch
Url: https://lists.ubuntu.com/archives/bazaar/attachments/20090319/8fd863b0/attachment-0001.diff 

