Bazaar-NG vs. Mercurial -- speed comparison

John Arbash Meinel john at arbash-meinel.com
Thu May 18 17:05:29 BST 2006


Robey Pointer wrote:
> 
> On 12 May 2006, at 23:44, John A Meinel wrote:
> 
>> Diwaker Gupta wrote:
>>> Bryan Sullivan (of Mercurial) recently posted this benchmark:
>>> <http://lists.freestandards.org/pipermail/lsb-futures/2006-May/002080.html>
>>>
>>
>> I wanted to post a little bit of a rebuttal to this. But first I would
>> like to say that Mercurial really does show off as a fast little system.
> 
> Sounds like a good time to bring the lsprof branch up to date!
> 
> I've now done so and its new home is here:
>     http://www.lag.net/~robey/code/bzr.dev.lsprof/
> 
> Using that branch, I did a sample "bzr branch" of bzr.dev from my local
> repository to a non-repo folder (forcing it to copy revisions), and
> posted the results here, for the curious:
>     http://www.lag.net/~robey/code/local-branch.html
> 
> The script I used to generate HTML from the lsprof output is here:
>     http://www.lag.net/~robey/code/lsprof-html.py
> 

That's a fun graph. I really like that all of the items are linked, so
you can easily navigate the call graph.

All I really get out of it is that our time is dominated by "fetch()",
some of which is 'join' time and some of which is
'fileids_altered_by_revision_ids'.

Looking at it, it would seem that bzr needs to start using more of a
temporary download cache (which might be memory-only in Transaction),
because it needs to download in revision, inventory, texts order, but
it should only add new entries in texts, inventory, revisions order.

So it should download to a staging area in one direction, and then put
it into storage in the opposite direction.
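
A minimal sketch of that two-direction idea, in plain Python. Nothing
here is bzrlib API: 'staged_fetch', the dict-based "stores" and the
(data, references) record layout are all made up for illustration.

```python
# Hypothetical sketch: none of these names are bzrlib APIs.
FETCH_ORDER = ["revisions", "inventories", "texts"]


def staged_fetch(source, target, wanted_revision_ids):
    """Download into a memory-only staging dict, then store in reverse order."""
    staging = {}
    keys = list(wanted_revision_ids)
    # Download direction: each layer tells us which keys the next layer needs.
    for kind in FETCH_ORDER:
        staging[kind] = {key: source[kind][key] for key in keys}
        keys = sorted({ref for _data, refs in staging[kind].values() for ref in refs})
    # Storage direction: write what is referenced before what references it.
    write_log = []
    for kind in reversed(FETCH_ORDER):
        for key, record in staging[kind].items():
            target.setdefault(kind, {})[key] = record
            write_log.append((kind, key))
    return write_log


# Toy data: every record is (data, [keys it references in the next layer]).
source = {
    "revisions": {"rev-1": ("commit message", ["inv-1"])},
    "inventories": {"inv-1": ("tree listing", ["file-a", "file-b"])},
    "texts": {"file-a": ("hello", []), "file-b": ("world", [])},
}
target = {}
write_log = staged_fetch(source, target, ["rev-1"])
```

In this toy run the texts land in 'target' first and the revision
last, so an interrupted copy would leave unreferenced texts behind
rather than a revision pointing at texts that were never stored.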

Also, I'm guessing knits are actually more expensive than weaves when
you are latency bound, because now you have 2x the files (.knit and .kndx).

> 
>> In a local network, this is what I get:
>>
>> $ time hg clone http://juju.arbash-meinel.com:8000/
>> real    0m18.448s  user    0m5.906s   sys     0m4.346s
>>
>> $ time bzr get http://bzr.arbash-meinel.com/mirrors/bzr/bzr.dev/ http
>> real    1m49.052s  user    0m34.059s  sys     0m10.676s
>>
>> $ time bzr get sftp://juju/srv/bzr/public/mirrors/bzr/bzr.dev/ sftp
>> real    1m41.964s  user    0m36.068s  sys     0m10.979s
>>
>> So bzr still needs to do some catching up, but in a local network it is
>> only 6x slower. (Honestly I thought sftp would spank http, I don't know
>> whether this is good or bad :)
> 
> I'm curious why you thought that, by the way...
> 
> robey

In everything before 0.8, http was *way* slower than sftp, primarily
because sftp could do a 'list_dir' and figure out which weaves needed
to be transferred, rather than grabbing the inventory.weave and parsing
out each inventory to get each file-id so that it would know which
file-id.weave files to go after.
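
Roughly, the two discovery strategies look like this. It is only a
sketch: the helper names are made up, an "inventory" is simplified to
a plain list of file-ids, and none of this is actual bzrlib code.

```python
# Hypothetical sketch; these helpers are not bzrlib functions.


def missing_weaves_via_listing(remote_listing, local_listing):
    """sftp-style: one directory listing tells us which weave files to copy."""
    return sorted(set(remote_listing) - set(local_listing))


def missing_weaves_via_inventories(inventories, local_listing):
    """http-style: no listing available, so scan every inventory for file-ids."""
    file_ids = set()
    for inventory in inventories:  # one simplified inventory per revision
        file_ids.update(inventory)
    wanted = {file_id + ".weave" for file_id in file_ids}
    return sorted(wanted - set(local_listing))


remote_listing = ["a-id.weave", "b-id.weave", "c-id.weave"]
local_listing = ["a-id.weave"]
inventories = [["a-id", "b-id"], ["a-id", "b-id", "c-id"]]

by_listing = missing_weaves_via_listing(remote_listing, local_listing)
by_inventories = missing_weaves_via_inventories(inventories, local_listing)
```

Both give the same answer on this toy data, but the second one has to
read every inventory before it knows which files to request.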

With VersionedFile and knits, I'm not sure how those code paths changed.
Most likely, sftp no longer uses 'list_dir', so it doesn't get the speed
boost. Probably this is best anyway, since it prevents cruft from
drifting through the system.

One of the nice things about knits is that 'bzr get' doesn't pull
unreferenced revisions. So if I test merge you, and then change my mind,
my local branch has some extra revisions, but if someone else merges me,
they don't get your revisions.
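
As an illustration, assume the fetch only copies what is reachable
from the branch tip by following parent links. That is my assumption
about the mechanism, and 'referenced_revisions' and the revision ids
below are made up rather than taken from bzrlib.

```python
# Hypothetical sketch of an ancestry walk; not bzrlib code.


def referenced_revisions(parents, tip):
    """Return every revision reachable from tip by following parent links."""
    seen = set()
    pending = [tip]
    while pending:
        rev = pending.pop()
        if rev in seen:
            continue
        seen.add(rev)
        pending.extend(parents.get(rev, []))
    return seen


# My store after a test merge I backed out of: 'yours-1' is still on
# disk, but nothing in my branch history points at it.
parents = {
    "mine-1": [],
    "mine-2": ["mine-1"],
    "yours-1": ["mine-1"],
}
to_fetch = referenced_revisions(parents, "mine-2")
```

Here 'to_fetch' holds only 'mine-1' and 'mine-2', so someone merging
from me never sees 'yours-1'.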

So while sftp itself has slightly more overhead, the advanced
functionality *used* to make it much faster. (Try it with bzr 0.7; the
two differ by something like 10x.)

And finally, we do have pipelining enabled on sftp. But because bzr
doesn't use any of the *_multi requests, nor is it async, there are
probably never 2 requests in-flight anyway.
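
To put a rough shape on why that matters when you are latency bound,
here is a latency-only estimate. The request count and round-trip time
are invented for illustration, not measured from bzr, and bandwidth
and server time are ignored entirely.

```python
import math


def transfer_ms(requests, round_trip_ms, in_flight=1):
    """Latency-only estimate: each batch of in_flight requests costs one round trip."""
    return math.ceil(requests / in_flight) * round_trip_ms


# Made-up numbers: 1000 small requests over a 50 ms round trip.
one_at_a_time = transfer_ms(1000, 50)             # nothing overlaps
ten_in_flight = transfer_ms(1000, 50, in_flight=10)
```

With one request outstanding at a time this comes to 50000 ms of pure
waiting; keeping ten in flight would cut it to 5000 ms under the same
assumptions.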

John
=:->
