[PACKS] Performance opportunities.
Robert Collins
robertc at robertcollins.net
Thu Aug 30 09:48:37 BST 2007
I haven't made these changes yet, but I've been profiling where the time
goes during commit. Timings in this tree are on my laptop; the tree is
an export of the HEAD of the mozilla sample tree we converted to bzr
some time back - 550MB of data, 55K files. The baseline is 4m3s user
time and 4m23s real time.
Three things so far stand out as things we don't /need/ to spend time
on.
One is having gzip files in the pack, rather than raw zlib. There is a
massive difference here - to compress a 550MB tar, GzipFile takes 86
seconds, while a trivial implementation using zlib directly takes 38
seconds (the gzip command line takes 36 seconds). Possibly we can fix up
GzipFile, but I have looked closely at it before, and I'm not convinced
that it's worth doing - packs are not zcattable, unlike .knit files,
which had no delimiters between gzip objects, so we need our own debug
tools anyway. A rough and ready change to this shaved 30s off commit.
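
A minimal sketch of the kind of comparison described above, assuming
Python 3 and in-memory sample data rather than the 550MB tar. The helper
names (compress_with_gzipfile, compress_with_zlib, time_call) and the
sample data are made up for illustration; this is not the code in bzr,
and the timings it prints will not match the figures quoted here.

```python
import gzip
import io
import time
import zlib


def compress_with_gzipfile(data, chunk_size=64 * 1024):
    """Compress through gzip.GzipFile, writing the data in chunks."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        for start in range(0, len(data), chunk_size):
            gz.write(data[start:start + chunk_size])
    return buf.getvalue()


def compress_with_zlib(data, chunk_size=64 * 1024):
    """Compress with a raw zlib compressor object (no gzip header/trailer)."""
    compressor = zlib.compressobj()
    parts = []
    for start in range(0, len(data), chunk_size):
        parts.append(compressor.compress(data[start:start + chunk_size]))
    parts.append(compressor.flush())
    return b"".join(parts)


def time_call(func, data):
    """Return (elapsed_seconds, result) for a single call of func(data)."""
    start = time.perf_counter()
    result = func(data)
    return time.perf_counter() - start, result


# Hypothetical sample input; substitute real tree data for a meaningful test.
sample = b"some moderately repetitive sample text\n" * 50000

gzip_seconds, gzip_bytes = time_call(compress_with_gzipfile, sample)
zlib_seconds, zlib_bytes = time_call(compress_with_zlib, sample)

print("GzipFile: %.3fs, %d bytes" % (gzip_seconds, len(gzip_bytes)))
print("zlib:     %.3fs, %d bytes" % (zlib_seconds, len(zlib_bytes)))
```

Both outputs carry the same deflate stream; the gzip form adds its own
header and a CRC32/length trailer, which is the part a pack with its own
record delimiters arguably does not need.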
Secondly we sha the working tree twice on an initial commit (bzr init;
bzr add; bzr commit) because everything is a miss - that's only
~3 seconds, but 3 seconds out of 263 is still > 1%.
Thirdly the way we store annotations has quite some overhead at the
moment. Turning our knit storage to use the PlainFactory rather than the
Annotated one saves 30 seconds.
So I have a prototype branch (it doesn't convert data, so it's quite
un-interoperable as yet) where I have commit at:
no-anno, zlib direct:
real 3m24.990s
user 3m1.215s
sys 0m11.377s
no anno:
real 3m50.336s
user 3m34.897s
sys 0m10.941s
baseline (my normal packs branch off of bzr.dev):
real 4m23.884s
user 4m3.963s
sys 0m11.649s
That's a 25% saving of userspace time (baseline vs. no-anno, zlib direct).
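
The saving can be checked from the timings listed above. A small sketch,
assuming Python 3; the to_seconds and saving_percent helpers are made up
for illustration, and the input figures are copied from the time output
above.

```python
def to_seconds(stamp):
    """Convert a time(1)-style stamp such as '4m3.963s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def saving_percent(before, after):
    """Percentage reduction going from before to after."""
    return 100.0 * (before - after) / before


# Figures from the runs listed above.
baseline_user = to_seconds("4m3.963s")
no_anno_user = to_seconds("3m34.897s")
no_anno_zlib_user = to_seconds("3m1.215s")

baseline_real = to_seconds("4m23.884s")
no_anno_zlib_real = to_seconds("3m24.990s")

user_saving = saving_percent(baseline_user, no_anno_zlib_user)
real_saving = saving_percent(baseline_real, no_anno_zlib_real)
anno_only_saving = saving_percent(baseline_user, no_anno_user)

print("user time saving, both changes: %.1f%%" % user_saving)
print("real time saving, both changes: %.1f%%" % real_saving)
print("user time saving, no-anno only: %.1f%%" % anno_only_saving)
```

The user-time saving for both changes together comes out between 25% and
26%; the wall-clock saving is somewhat smaller.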
Concretely, I plan to switch to using zlib directly in packs. I'll also
look at making the annotation cache be separate and disable-able.
I'm looking for critiques and 'good idea', 'bad idea' comments on this,
as well as suggestions for other things we can do in the short time
remaining before I'll need to start solidifying packs for 0.92 - when I'd
like to release the first user-exposed format.
-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.