Packs and performance

Robert Collins robertc at robertcollins.net
Wed Nov 28 20:00:34 GMT 2007


Martin's asked that we try to have packs as default in the next release.
I'm seeing mixed reports of performance improvements with packs.
Specifically, the commands that have been hammered on seem fast, but
other commands have regressed considerably, e.g.:

per-file log: from 1.6 seconds to 13.3 seconds.
https://bugs.launchpad.net/bugs/172567

So I'd like to ask everyone that is running bzr.dev to:
 * convert to packs locally (that is, back up your repository, then run
bzr reconcile; bzr upgrade --pack-0.92)
 * do your normal work. If you encounter a performance regression,
confirm it against your knit repository. Then look in this search:
https://bugs.edge.launchpad.net/bzr/+bugs?field.tag=packs
and see if that command is listed already. If it is not, please file a
bug.

In the bug we need:
 * The command arguments (anonymising them is fine, but knowing that you
used -r x..-1 vs -c etc. is important)
 * A callgrind file for the packs operation, e.g.
bzr --lsprof-file ../packs.callgrind OPERATION
 
Callgrind files disclose the file path bzr is running from, but that is
all: http://launchpadlibrarian.net/10624940/packs.callgrind is a sample
you can see. You're welcome to run sed over the callgrind if the path is
important to you.
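
If you would rather not reach for sed, the same scrubbing can be sketched
in a few lines of Python. This is only an illustration: the function
names, the "/REDACTED" placeholder and the sample callgrind lines are all
made up for the example, and it simply replaces every occurrence of the
prefix you give it, much as a global sed substitution would.

```python
from pathlib import Path


def scrub_prefix(text, private_prefix, placeholder="/REDACTED"):
    """Replace every occurrence of private_prefix with placeholder.

    Hypothetical helper: equivalent in spirit to a global sed substitution.
    """
    if not private_prefix:
        raise ValueError("private_prefix must not be empty")
    return text.replace(private_prefix, placeholder)


def scrub_file(src, dst, private_prefix):
    """Read a callgrind file, scrub the prefix, and write a new file."""
    data = Path(src).read_text(encoding="utf-8", errors="replace")
    Path(dst).write_text(scrub_prefix(data, private_prefix), encoding="utf-8")


# Made-up sample lines in the general shape of a callgrind file.
sample = "fl=/home/alice/src/bzr.dev/bzrlib/log.py\nfn=show_log\n12 3400\n"
scrubbed = scrub_prefix(sample, "/home/alice/src")
print(scrubbed)
```

Something like scrub_file("packs.callgrind", "packs.scrubbed.callgrind",
"/home/alice/src") would then give you a copy to attach to the bug. Check
the output before uploading, since the prefix match is purely textual.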

I'll make sure to look at all of these bugs quickly and try to identify
the root cause of any performance decreases.

Some will be easy to fix.

Some will require existing, but non-trivial, bugs to be fixed.

I had a chat on IRC earlier about this...
{{{
...
06:39 < luks> well, I have generally very mixed feeling about packs, so
I'm probably not motivated enough :)
06:40 < luks> maybe it's just me, but it feels a lot slower on many
local operations, which is what I'm primarily interested in
06:40 < lifeless> luks: file bugs, lots of bugs.
06:40 < luks> it's all caused by parsing the big indexes
06:40 < lifeless> luks: I know for absolute fact that its faster on many
local operations; but we're likely not looking at the same data set, nor
the same operations.
06:41 < luks> yep, simple things like diff or st are faster
06:41 < luks> historical diff and st, I mean
06:41 < luks> but for example just generating a bundle for 'send' is
slower
06:42 < lifeless> luks: this is why is was experimental in 0.92; and I
was expecting a little more time to find performance regressions (where
'all history' become more expensive, but 'partial history' became
possible.) and fix then
06:42 < lifeless> s/then/them/
06:42 < lifeless> bug 165309 will help with the index performance
06:42 < ubotu> Launchpad bug 165309 in bzr "pack index has no
topological locality of reference" [Medium,Confirmed]
https://launchpad.net/bugs/165309
06:42 < lifeless> but changing all-history API usage to partial-history
API usage is the key thing needed to improve performance
06:43 < lifeless> knits are incapable of partial-history operations
06:43 < lifeless> but they were tuned to do all-history very fast
06:43 < lifeless> it turned out that this was not a good enough solution
}}}

The point of this is that:
 - packs make different tradeoffs than knits
 - the tradeoffs permit greater scalability (millions of revisions,
millions of files) for this core layer of the system.
 - operations that access all history, *incrementally*, will be slower.

So we need to change code that accommodated the constraint of knits
(that we always read all history for an object) by using all-history
APIs that were 'cheap', so that it uses partial-history operations and
partial-history APIs instead. It's often trivial to do this, but we
*need* documentation about where it is happening: what things users are
hitting.
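
To make the all-history vs partial-history distinction concrete, here is
a toy sketch. It is not bzrlib code: CountingIndex, recent_all_history
and recent_partial_history are hypothetical names, and the "index" is
just a dict of revision id to parent ids. It only shows why a caller that
wants the last few revisions should not have to read every index entry.

```python
from collections import deque


class CountingIndex:
    """Toy stand-in for a repository index; counts entries read."""

    def __init__(self, parents):
        self._parents = parents
        self.reads = 0

    def get_parents(self, rev):
        self.reads += 1
        return self._parents[rev]

    def all_revisions(self):
        return list(self._parents)


def _walk(get_parents, tip, limit):
    """Breadth-first walk from tip, stopping after limit revisions."""
    seen, order, queue = {tip}, [], deque([tip])
    while queue and len(order) < limit:
        rev = queue.popleft()
        order.append(rev)
        for parent in get_parents(rev):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return order


def recent_all_history(index, tip, limit):
    # All-history style: load the whole graph up front, then walk it.
    graph = {rev: index.get_parents(rev) for rev in index.all_revisions()}
    return _walk(graph.__getitem__, tip, limit)


def recent_partial_history(index, tip, limit):
    # Partial-history style: only read the entries the walk touches.
    return _walk(index.get_parents, tip, limit)


# A made-up linear history of 1000 revisions.
parents = {"rev-%d" % i: (["rev-%d" % (i - 1)] if i else [])
           for i in range(1000)}

all_idx = CountingIndex(parents)
partial_idx = CountingIndex(parents)
a = recent_all_history(all_idx, "rev-999", 3)
b = recent_partial_history(partial_idx, "rev-999", 3)
print(a == b, all_idx.reads, partial_idx.reads)
```

Both calls return the same three revisions; the difference is how many
index entries each one had to read to get there.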

So

PLEASE FILE BUGS.

That is all :)
-Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.