Bazaar 1.6 released - some benchmarks.
John Arbash Meinel
john at arbash-meinel.com
Wed Aug 27 17:31:49 BST 2008
David Ingamells wrote:
> I have to say that my first experience with bzr 1.6 performance is very
> disappointing. I observe times more than 300% slower than bzr version
> 1.5 (which itself showed no improvements over bzr 1.2). Checkout
> --lightweight is, in some cases, even worse.
> Working with a repos with only a few revisions (33) but LOTS of files:
So I think it probably comes down to some sort of latency/per file multiplier.
Because for bzr.dev, which has lots of revisions (19,000 revisions, 881
files), I see the opposite:
$ bzr1.5 branch
23.85s user 11.58s system 41% cpu 1:24.82 total
$ bzr.dev branch
46.97s user 22.43s system 88% cpu 1:18.85 total
Now, if I simulate latency using:
sudo tc qdisc add dev lo root netem delay 200ms
(effectively 400ms ping time on the loopback)
Then the latency aspect of 1.6 shows up:
$ bzr1.5 branch bzr+ssh
24.80s user 12.98s system 23% cpu 2:40.82 total
$ bzr.dev branch bzr+ssh
50.77s user 22.91s system 29% cpu 4:09.95 total
$ bzr1.5 branch sftp
65.32s user 1.17s system 23% cpu 4:47.68 total
$ bzr1.6 branch sftp
43.77s user 1.08s system 13% cpu 5:44.32 total
Basically, Robert changed the fetch code, to have better layering, and the
ability to issue single requests for multiple file histories. In doing so, we
"lost" an RPC verb that 1.5 used to efficiently define a stream that we would
So now the inspection of ancestry graphs and per-file graphs is being done by
the client again. And probably you are running into latency for each file graph.
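
A back-of-the-envelope way to read the timings above: divide the extra wall-clock time by the simulated round-trip time to get an implied number of blocking round trips. This is only a sketch. It assumes the first pair of timings was taken on the same setup without the added delay (not stated above), and that all of the extra time is spent waiting on round trips. The helper names are made up for illustration.

```python
def to_seconds(wall):
    """Convert an 'M:SS.ss' wall-clock string (as printed by time) to seconds."""
    minutes, seconds = wall.split(":")
    return int(minutes) * 60 + float(seconds)

# 200ms netem delay on lo applies in each direction, so about 400ms per round trip.
RTT = 0.4

# (without delay, bzr+ssh with delay) wall-clock totals quoted above.
# Assumption: the two runs differ only in the added latency.
timings = {
    "bzr1.5": ("1:24.82", "2:40.82"),
    "bzr.dev": ("1:18.85", "4:09.95"),
}

def implied_round_trips(baseline, delayed, rtt=RTT):
    """Extra wall time divided by the round-trip time."""
    return (to_seconds(delayed) - to_seconds(baseline)) / rtt

for name, (baseline, delayed) in timings.items():
    print(name, round(implied_round_trips(baseline, delayed)))
```

Under those assumptions this works out to roughly 190 round trips for bzr1.5 and roughly 430 for bzr.dev, which would fit the idea that the newer fetch code pays the latency cost more often.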
Actually, something weird seems to be happening. Looking closely at "bzr
branch -Dhpss bzr+ssh:///" I see stuff like:
44.755 hpss call w/readv: 'readv', '...8b7b7.pack'
44.755 30 bytes in readv request
44.756 result: 0.001s 'readv',
44.756 1428 body bytes read
44.756 hpss call w/readv: 'readv', '...24b20.pack'
44.756 28 bytes in readv request
44.757 result: 0.001s 'readv',
44.757 653 body bytes read
44.757 hpss call w/readv: 'readv', '...aac8e.pack'
44.757 18 bytes in readv request
44.758 result: 0.001s 'readv',
44.758 3655 body bytes read
44.767 hpss call w/readv: 'readv', '...aac8e.pack'
44.767 18 bytes in readv request
44.768 result: 0.001s 'readv',
44.768 6588 body bytes read
44.771 hpss call w/readv: 'readv', '...aac8e.pack'
44.771 17 bytes in readv request
44.772 result: 0.001s 'readv',
44.772 571 body bytes read
44.773 hpss call w/readv: 'readv', '...aac8e.pack'
44.773 18 bytes in readv request
44.774 result: 0.001s 'readv',
44.774 1339 body bytes read
44.952 hpss call w/readv: 'readv', '...aac8e.pack'
44.952 61 bytes in readv request
44.954 result: 0.002s 'readv',
44.954 53257 body bytes read
44.961 hpss call w/readv: 'readv', '...8b7b7.pack'
44.961 16 bytes in readv request
44.962 result: 0.001s 'readv',
44.962 202 body bytes read
Which means that we are issuing requests to packs A, B, C, C, C, C, C, A.
I'll try to dig into why we would not be collapsing the 5 requests to the same
pack into a single request.
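
To illustrate what "collapsing" would mean here, a minimal sketch that merges runs of consecutive requests to the same pack into one request. It is not bzrlib's code: the (pack, ranges) structure, the function name, and the offsets are all invented for the example, and only the A, B, C, C, C, C, C, A ordering comes from the log above.

```python
from itertools import groupby

def collapse_adjacent(requests):
    """Merge consecutive requests that target the same pack.

    `requests` is a list of (pack_name, ranges) tuples, where `ranges` is a
    list of (offset, length) pairs. Hypothetical structure, for illustration.
    """
    collapsed = []
    for pack, run in groupby(requests, key=lambda req: req[0]):
        # Gather every range in this run and sort by offset.
        ranges = sorted(rng for _, req_ranges in run for rng in req_ranges)
        collapsed.append((pack, ranges))
    return collapsed

# Pack order as seen in the log; offsets and lengths are made up.
observed = [
    ("A", [(0, 100)]),
    ("B", [(0, 100)]),
    ("C", [(0, 100)]),
    ("C", [(500, 100)]),
    ("C", [(200, 100)]),
    ("C", [(900, 100)]),
    ("C", [(700, 100)]),
    ("A", [(300, 100)]),
]

collapsed = collapse_adjacent(observed)
print([pack for pack, _ in collapsed])
```

With the observed ordering this turns 8 requests into 4 (A, B, C, A). The two requests to pack A stay separate because they are not adjacent; merging those as well would need the caller to know all wanted ranges up front.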
> I ran these tests with lots of hope for improvements, and really wanted
> to be able to send good news. I would love it if someone can point out a
> stupid error I've made which will turn the figures around. Maybe I've
> misunderstood the intentions of stacked branches. I have shown here that
> they do save lots of disk space, but I had also expected performance
> similar to bzr 1.5's checkout --lightweight. I have shown that branch
> --stacked and checkout --lightweight are similar in version 1.6 (which
> makes sense), but why are all the timings much slower than their
> equivalents in bzr 1.5???
You *are* mistaken about the use case for "--stacked" branches. They are a
stepping stone towards "shallow" branches, which would be used for your use case.
Basically, "stacked" allows you to have some data locally, but not all of it.
But the default implementation copies "0" data, so all of it has to be fetched
from the remote host. So it is effectively a --lightweight checkout at that
point. The idea of "shallow" branches is that when you branch, it will copy
enough information to at least be able to build the *current* working tree,
without having to access the remote host.
--stacked, as it stands, is more about a "publish my changes without a working
tree" use case, which is very nice for stuff like "bzr push lp:", where you
can't publish into a shared repository.
I'll try to dig into the other pieces.