Performance improvements for bzr-2.4 on large trees

Martitza Mendez martitzam at gmail.com
Wed May 18 00:18:25 UTC 2011


On Tue, May 17, 2011 at 8:04 AM, John Arbash Meinel
<john at arbash-meinel.com> wrote:

> The last of my patches is queued up to land, so I figured I'd post an
> update about the performance improvements I've been working on. I'm also
> just excited about how well it has all come together.
>
> There were essentially 3 changes that mattered for performance on large
> trees.
>
> 1) Fixing iter_entries_by_dir() to preload the data in Repository-
>   optimal ordering rather than by-request ordering. In large trees
>   this was causing us to thrash and become pathologically slow.
>   In the 70k-entry test tree, the thrashing version took about 3
>   minutes; the preloading version takes about 15s. This affected a lot of our
>   commands, though I guess the next two fixes would actually reduce
>   the number of commands affected by this.
>
> 2) Fixing several code paths to use the optimized iter_changes() rather
>   than the generic iter_changes(). The generic path walks both
>   inventories with iter_entries_by_dir() and compares them. Our 2a format
>   Repository can do iter_changes() without loading the whole tree. (It
>   internally uses a hash_trie to store the inventory, so nodes
>   with matching sub-trees can be skipped during comparison.)
>   This generally shows up as something that was taking 15s (to load
>   the whole inventory) dropping to <2s for the improved comparison.
>   (bzr revert and bzr pull were both directly impacted here)
>
> 3) Changing WT.set_parent_trees([one_tree]) to update itself using
>   current_basis.iter_changes(one_tree), rather than setting the state
>   from scratch.
>   This basically adds another case where we can avoid reading the
>   whole inventory state again, which is another 15s to <2s sort of
>   change.
>   This only showed up after fixing (2), because once the tree is
>   loaded, the other actions are generally pretty quick.
>   (bzr up, bzr pull)
>
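
A minimal sketch of the idea behind fix (2), for readers who have not seen
hash-keyed trees before. This is not bzrlib code: Node, build() and this
iter_changes() are hypothetical names made up for the illustration, and the
real 2a inventory layout is more involved. It only shows why two trees whose
sub-trees carry the same hash key can be compared without descending into
those sub-trees.

```python
import hashlib


class Node:
    """Toy tree node (hypothetical): file entries plus child directories.

    The key is a hash over the node's own entries and its children's keys,
    so two nodes with equal keys describe identical sub-trees.
    """

    def __init__(self, items, children):
        self.items = dict(items)
        self.children = dict(children)
        digest = hashlib.sha1()
        for name, value in sorted(self.items.items()):
            digest.update(f"I{name}={value};".encode())
        for name, child in sorted(self.children.items()):
            digest.update(f"C{name}={child.key};".encode())
        self.key = digest.hexdigest()


def build(tree):
    """Build Nodes from nested dicts: str values are files, dicts are dirs."""
    items = {k: v for k, v in tree.items() if not isinstance(v, dict)}
    children = {k: build(v) for k, v in tree.items() if isinstance(v, dict)}
    return Node(items, children)


def iter_changes(old, new, path="", stats=None):
    """Yield (path, old_value, new_value), skipping identical sub-trees."""
    if old is not None and new is not None and old.key == new.key:
        return  # same key: nothing below this point can differ
    if stats is not None:
        stats["visited"] += 1
    old_items = old.items if old else {}
    new_items = new.items if new else {}
    for name in sorted(set(old_items) | set(new_items)):
        before, after = old_items.get(name), new_items.get(name)
        if before != after:
            yield (path + name, before, after)
    old_children = old.children if old else {}
    new_children = new.children if new else {}
    for name in sorted(set(old_children) | set(new_children)):
        yield from iter_changes(old_children.get(name), new_children.get(name),
                                path + name + "/", stats)


basis = {
    "README": "r1",
    "src": {"a.py": "a1", "b.py": "b1"},
    "doc": {"index.txt": "d1", "api": {"x.txt": "x1"}},
}
target = {
    "README": "r1",
    "src": {"a.py": "a2", "b.py": "b1"},
    "doc": {"index.txt": "d1", "api": {"x.txt": "x1"}},
}

stats = {"visited": 0}
changes = list(iter_changes(build(basis), build(target), stats=stats))
print(changes)
print("nodes visited:", stats["visited"])
```

Only the root and "src" are opened here; "doc" and everything under it is
skipped because its key matches on both sides. A generic comparison would
have walked every entry of both trees.
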
> This is the chart I put together for "whats-new-in-2.4.txt". bzr-2.3.2
> will have fix (1), but not (2) or (3), to give a feel for how much of an
> impact different fixes have had.
>
>    bzr-2.3.1  bzr-2.3.2  bzr-2.4  action
>    3m39s      1m08s      1m03s    bzr co --lightweight
>      38s         8s         2s    bzr revert (in a clean tree)
>    4m47s      3m56s        15s    bzr merge
>    4m45s        20s         3s    bzr pull
>    4m58s      3m00s         2s    bzr up
>    9m33s        21s        19s    bzr uncommit (including a merge)
>    4m44s        17s         2s    bzr uncommit (simple commit)
>
> So yes, some operations that were taking almost 5 minutes have now
> dropped down to taking <3s.
>
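
To put ratios on the large-tree chart above, here is a small sketch that
converts the quoted bzr-2.3.1 and bzr-2.4 timings to seconds and divides
them. The numbers are copied from the chart; seconds() and the dictionary
layout are just helpers invented for this example.

```python
import re


def seconds(text):
    """Convert a duration such as '4m58s' or '15s' to whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, secs = match.groups()
    return int(minutes or 0) * 60 + int(secs)


# (bzr-2.3.1, bzr-2.4) timings for the 70k-entry tree, from the chart.
timings = {
    "bzr co --lightweight": ("3m39s", "1m03s"),
    "bzr revert (in a clean tree)": ("38s", "2s"),
    "bzr merge": ("4m47s", "15s"),
    "bzr pull": ("4m45s", "3s"),
    "bzr up": ("4m58s", "2s"),
    "bzr uncommit (including a merge)": ("9m33s", "19s"),
    "bzr uncommit (simple commit)": ("4m44s", "2s"),
}

speedups = {
    action: seconds(old) / seconds(new)
    for action, (old, new) in timings.items()
}

for action, ratio in speedups.items():
    print(f"{ratio:6.1f}x  {action}")
```
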
> You won't see as dramatic an improvement on smaller trees, though
> most cases will have a pleasant improvement. Here is a short list for
> the 'Launchpad' tree (with ~8k items).
>
>    bzr-2.3.1   bzr-2.4     action
>    5.3s        5.2s        bzr co --lightweight
>    0.9s        0.3s        bzr revert
>    1.4s        0.4s        bzr pull
>    3.9s        3.7s        bzr uncommit (with merge)
>    0.9s        0.3s        bzr uncommit (without merge)
>
> Anyway, I'm quite happy about how much better bzr-2.4 will be in large
> trees. The remaining cases for merge/update probably need a new format
> to address properly. (Basically, we want to stop
> caching extra-parent trees in the dirstate file.)
>
> John
> =:->
>
>
This is beyond awesome!  Ok, more exclamations: !!!

I will be trying this out on the command line as soon as it lands in lp:bzr
and again when it gets packaged up as part of the standalone installer.
Thank you thank you thank you!!!

~M