dirstate: close to merging, and some performance analysis.
Robert Collins
robertc at robertcollins.net
Mon Mar 5 05:07:28 GMT 2007
Hi,
Dirstate is -very- close to mergable now. I'm working on upgrade code
for it, to let people upgrade their trees, and we've got some pretty
darn good performance results. Theres *still* a regression on xml
support from 0.14 - and I suspect that whatever its related too may
affect dirstate trees too - which is good, because we're currently nice
and fast even with it. But as we fallback to some of the same code
paths, fixing it before release is important.
The follow is on a tree with 4000 entries in it, on my laptop. Each
command is run twice, to reduce cold-cache effects.
bzr 0.14:
$ time bzr --no-plugins st xml/
real 0m4.556s
user 0m1.764s
sys 0m0.216s
$ time bzr --no-plugins st xml/
real 0m1.976s
user 0m1.756s
sys 0m0.128s
dirstate on xml:
$ time bzr --no-plugins st xml/
real 0m3.550s
user 0m3.160s
sys 0m0.196s
$ time bzr --no-plugins st xml/
real 0m3.200s
user 0m2.864s
sys 0m0.204s
dirstate with dirstate:
$ time bzr --no-plugins st dirstate/
real 0m2.265s
user 0m0.732s
sys 0m0.176s
$ time bzr --no-plugins st dirstate/
real 0m0.935s
user 0m0.728s
sys 0m0.128s
Comparing the best user + sys times for 0.14 to dirstate with dirstate
we have:
0.14 : 1.884s
dirstate: 0.856s
or a 54% time reduction: quite nice.
IIRC John has measured startup time at ~200ms, and removing that from
the comparison:
0.14 : 1.684s
dirstate: 0.656s
Gives a 61% reduction.
Comparing partial tree operations, same tree, but a subdir with 3380
paths under it:
0.14:
$ time bzr --no-plugins st xml/subtree
real 0m1.716s
user 0m1.576s
sys 0m0.136s
$ time bzr --no-plugins st xml/subtree
real 0m1.714s
user 0m1.580s
sys 0m0.136s
dirstate xml:
$ time bzr --no-plugins st xml/subtree
real 0m2.690s
user 0m2.472s
sys 0m0.220s
$ time bzr --no-plugins st xml/subtree
real 0m2.692s
user 0m2.500s
sys 0m0.188s
dirstate dirstate:
$ time bzr --no-plugins st xml/subtree
real 0m0.771s
user 0m0.672s
sys 0m0.100s
$ time bzr --no-plugins st xml/subtree
real 0m0.774s
user 0m0.696s
sys 0m0.076s
Again, comparing best times from 0.14 to dirstate with dirstate:
including startup excluding startup
0.14: 1.712s 1.512
dirstate: 0.772s 0.572s
reduction: 55% 0.62%
These figures are in line with the whole tree reduction, which seems
fine.
lastly, what about status of a single file in the same tree?
0.14:
$ time bzr --no-plugins st xml/subtree/file
real 0m0.959s
user 0m0.848s
sys 0m0.112s
$ time bzr --no-plugins st xml/subtree/file
real 0m0.959s
user 0m0.884s
sys 0m0.072s
dirstate xml:
$ time bzr --no-plugins st xml/subtree/file
real 0m1.046s
user 0m0.944s
sys 0m0.100s
$ time bzr --no-plugins st xml/subtree/file
real 0m1.046s
user 0m0.960s
sys 0m0.084s
dirstate dirstate:
$ time bzr --no-plugins st xml/subtree/file
real 0m0.356s
user 0m0.308s
sys 0m0.040s
$ time bzr --no-plugins st xml/subtree/file
real 0m0.355s
user 0m0.292s
sys 0m0.060s
Again, comparing best times from 0.14 to dirstate with dirstate:
including startup excluding startup
0.14: 0.956s 0.756s
dirstate: 0.348s 0.148s
reduction: 63% 0.80%
An interesting question is 'how long does it take each file to
process?'. I dont think I've plotted enough data, or done enough runs,
to say with great confidence, but heres a few interesting figures.
Firstly, lets eliminate the time required to get the data: opening the
tree and loading the dirstate or inventory.
For 0.14 on xml: 0.265s, and
for dirstate, thats 0.058s (both according to timeit).
$ PYTHONPATH=~/source/baz/dirstate.dogfood/ python -m timeit -s "import
bzrlib.workingtree"
"tree=bzrlib.workingtree.WorkingTree.open('dirstate')"
"tree.lock_read(); tree.current_dirstate()._read_dirblocks_if_needed();
tree.unlock()"
$ python -m timeit -s "import bzrlib.workingtree"
"tree=bzrlib.workingtree.WorkingTree.open('xml')" "tree.lock_read();
tree.inventory; tree.unlock()"
These trees have 4153 files (to be precise :).
The startup cost-per-file (assuming neither format has partial-reading
facilities) is then 0.265/4153 and 0.058/4153
0.14 : 0.265/4153 = .0000638s
dirstate: 0.058/4153 = .0000140s
reduction 78%
Subtracting all these overheads gives us a hopefully raw cost:
Full(4153) subtree(3380) one file(1)
0.14 : 1.419s 1.247s 0.491
dirstate: 0.598s 0.514s 0.090
However its clearly not the real base cost, because the one-file figure
is still too high a proportion: so we're missing some constant overheads
(massive ones in the 0.14 case apparently :). Profiling is needed to
ascertain what they are. We can however, approximate the cost-per-path
by subtracting the one-file total cost, and dividing by the remaining
file count:
Full(4152) subtree(3379) one file(0)
0.14 : 0.928s=>0.22ms 0.756s=>0.22ms 0/0 ???
dirstate: 0.508s=>0.12ms 0.424s=>0.13ms 0/0 ???
I think these figures are fairly representative: Once we eliminate bzr
statup (0.2seconds), tree opening and state reading (0.265s for xml,
0.058s for dirstate), and (unknown startup time + time to do one file)
(1/2 a second for xml!, 0.09s for dirstate) we get a figure of 0.22ms
per path for 0.14 xml trees, and 0.13ms per path for dirstate trees.
(Note that startups will include is_ignored regex compilation).
Going back to the totals:
total startups perfile-total ratio
0.14 : 1.884s 0.956 0.928s 49%
dirstate: 0.856s 0.348 0.508s 59%
So dirstate has saved about 1 second, made up of 600ms from total setup
overheads, and 400ms from per-file processing. 59% of the remaining time
dirstate spends on 4000 entry trees is in per-file processing (so that
sounds like a good place to focus on reducing the total time for large
trees: at 4000 its dominating (just), and 55K for mozilla, well heres a
guess:
0.348s -0.058 + 55000*(0.13ms + 0.014ms) = 7.789s
At that scale, 95% of the time spent is in the scales-per-path part of
the equation, and of that 89% is in the process-a-single-path overhead,
not in the loading of entries from disk.
So the next challenge for performance tuning is dropping that 0.13ms
dramatically IMO: if we can drop the cost per path by another 50%, we
would be sitting really pretty for performance, I think.
-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20070305/a050f370/attachment-0001.pgp
More information about the bazaar
mailing list