Improvements in handling new checkouts (for large trees)

John Arbash Meinel john at arbash-meinel.com
Mon Apr 11 13:56:48 UTC 2011


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

A couple of changes have now landed in bzr 2.4, which I figured I'd
mention to people.

The first change (bug #737234) avoids redundant work when accessing the
whole inventory. We had a cache in place, but large trees could overflow
it. With the change, walking all entries in a 70k-entry tree drops from
more than 4min to under 11s.
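To illustrate why an overflowing cache can be so costly, here is a minimal
sketch of a bounded least-recently-used cache being walked in order. It is
not bzr's actual code: the LRUCache class, the walk() helper and the page
counts are all hypothetical, and the eviction policy is an assumption. The
point is only that once a sequential walk is even slightly larger than the
cache, every repeated access can turn into a miss.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical bounded cache; evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, load):
        if key in self._data:
            self._data.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._data[key]
        self.misses += 1
        value = load(key)  # the expensive part (disk or network read)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the oldest entry
        return value


def walk(cache, n_pages, passes):
    """Visit every page in order, several times; return (hits, misses)."""
    for _ in range(passes):
        for page in range(n_pages):
            cache.get(page, load=lambda k: "page-%d" % k)
    return cache.hits, cache.misses


# One page more than the cache holds: each page is evicted just before reuse.
small = walk(LRUCache(capacity=100), n_pages=101, passes=3)
# Everything fits: only the first pass pays for loading.
large = walk(LRUCache(capacity=100), n_pages=100, passes=3)
print(small, large)
```

In this toy setup the slightly-too-small cache never scores a hit, while the
cache that fits only misses on the first pass.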

The most obvious place this affects is "bzr branch" when creating a new
workspace, and "bzr checkout" does the same work. It is probably even
more significant for "bzr co --lightweight", since the extra data would
be transferred over the network (8GB reduced to 150MB).

The follow-up to that is bug #740932. Building the working tree wasn't
updating the cache information for the newly generated files. So after
building the working tree, the first 'bzr status' (or commit, even if
you never ran status) would re-read every file and compute its sha1
hash to see whether it had changed.
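The idea can be sketched roughly as follows. This is not bzr's dirstate
code: StatCache, build_tree() and status() are hypothetical names, and
using (size, mtime) as the fingerprint is an assumption. The sketch compares
building a tree without recording cache entries against building it while
recording the hash of each file as it is written.

```python
import hashlib
import os
import tempfile


class StatCache:
    """Hypothetical cache mapping path -> ((size, mtime_ns), sha1)."""

    def __init__(self):
        self._entries = {}
        self.files_read = 0  # how many files had to be re-read and hashed

    def _fingerprint(self, path):
        st = os.stat(path)
        return (st.st_size, st.st_mtime_ns)

    def record(self, path, sha1):
        """Store a hash we already know, e.g. right after writing the file."""
        self._entries[path] = (self._fingerprint(path), sha1)

    def sha1(self, path):
        fp = self._fingerprint(path)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == fp:
            return cached[1]  # stat matches: trust the cached hash
        with open(path, "rb") as f:
            digest = hashlib.sha1(f.read()).hexdigest()
        self.files_read += 1
        self._entries[path] = (fp, digest)
        return digest


def build_tree(root, contents, cache=None):
    """Write files; optionally record their hashes while we still know them."""
    for name, data in contents.items():
        path = os.path.join(root, name)
        with open(path, "wb") as f:
            f.write(data)
        if cache is not None:
            cache.record(path, hashlib.sha1(data).hexdigest())


def status(root, names, cache):
    return {n: cache.sha1(os.path.join(root, n)) for n in names}


contents = {"a.txt": b"alpha", "b.txt": b"beta"}

# Old behaviour: the cache is empty after the build, so the first status
# re-reads every file; only the second status is cheap.
with tempfile.TemporaryDirectory() as root:
    cold = StatCache()
    build_tree(root, contents)
    status(root, contents, cold)
    cold_reads = cold.files_read
    status(root, contents, cold)
    cold_reads_second = cold.files_read

# New behaviour: the build fills the cache, so the first status reads nothing.
with tempfile.TemporaryDirectory() as root:
    warm = StatCache()
    build_tree(root, contents, cache=warm)
    status(root, contents, warm)
    warm_reads = warm.files_read

print(cold_reads, cold_reads_second, warm_reads)
```

The sketch ignores details a real tool has to handle, such as not trusting
files whose mtime is too recent to distinguish from a later write.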

The timing test is basically:

 time bzr co --lightweight ../big-branch test-tree
 cd test-tree
 time bzr st
 time bzr st

Comparing bzr.dev 5729 (before the first patch) with bzr.dev 5779 (current tip):

  5729    5779   command
16m02s   8m11s   bzr co --lightweight
 3m43s     26s   bzr st (first run)
    3s      3s   bzr st (second run)

Under Linux, the times all look a lot better, because creating a new
file on disk isn't nearly as expensive. However, that makes the CPU
overhead benefit much more apparent.

  5729    5779   command
 3m39s   1m03s   bzr co --lightweight
   13s      6s   bzr st (first run)
    1s      1s   bzr st (second run)
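For anyone who wants the ratios rather than the raw times, a small sketch
that converts the figures from the two tables above into speedup factors.
The seconds() helper and the table labels are just for illustration; the
timings themselves are copied from the tables.

```python
import re


def seconds(text):
    """Parse a duration such as '16m02s' or '26s' into whole seconds."""
    m = re.fullmatch(r"(?:(\d+)m)?(\d+)s", text.strip())
    if m is None:
        raise ValueError("unrecognised duration: %r" % text)
    return int(m.group(1) or 0) * 60 + int(m.group(2))


# (label, time with 5729, time with 5779), taken from the tables above.
timings = [
    ("first table, co --lightweight", "16m02s", "8m11s"),
    ("first table, first st", "3m43s", "26s"),
    ("Linux table, co --lightweight", "3m39s", "1m03s"),
    ("Linux table, first st", "13s", "6s"),
]

speedups = {
    label: round(seconds(before) / seconds(after), 1)
    for label, before, after in timings
}

for label, factor in speedups.items():
    print("%-32s %.1fx faster" % (label, factor))
```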

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk2jCKAACgkQJdeBCYSNAAOMUwCgzQiZ9HYmaQxSpsOWRnfl+K/t
zOIAn1UBKp4JERdTdYNNGZJ/sVmHa4tt
=1dY5
-----END PGP SIGNATURE-----


