Bazaar-NG vs. Mercurial -- speed comparison

Thu May 18 20:12:00 BST 2006

John A Meinel <john <at> arbash-meinel.com> writes:

> First, mercurial uses a custom server rather than working off of 'plain'
> http/sftp.

Mercurial actually works over plain HTTP, though you can't push over HTTP yet.

> This does give it a huge latency advantage. It has some
> drawbacks, as in it is another thing that needs to be set up, holes
> opened in firewalls, etc.

We tunnel over regular ssh, or you can install a CGI script to serve
over an existing HTTP server on port 80.  Either mechanism takes about
two minutes to set up and debug.

> I should revisit my
> performance testing with knits.

My performance tests were run using knits.

> Mercurial has used 'revfiles' for a long
> time, which are very similar to knits.

The only similarity is that both are append-only.  Revlog files do not
represent a weave, while knits do.  I suspect that you have to read an entire
knit file to reconstruct the most recent revision, while a revlog file has
a small upper bound on the amount of data you have to read to reconstruct
any revision.
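
As a rough illustration of how an append-only delta log can keep the read
cost of any revision bounded, here is a minimal Python sketch.  It is not
revlog's actual on-disk format: the DeltaLog class, the prefix/suffix delta
encoding, and the rule "store a full snapshot once the chain would cost more
than factor times the text size" are all assumptions made for the example.

```python
def make_delta(old, new):
    """Describe `new` as: keep a prefix and a suffix of `old`, replace the middle."""
    limit = min(len(old), len(new))
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    while suffix < limit - prefix and old[-1 - suffix] == new[-1 - suffix]:
        suffix += 1
    return prefix, suffix, new[prefix:len(new) - suffix]


def apply_delta(old, delta):
    """Rebuild the newer text from the older text and a delta."""
    prefix, suffix, middle = delta
    return old[:prefix] + middle + old[len(old) - suffix:]


class DeltaLog:
    """Hypothetical append-only log of revisions with a bounded read cost."""

    def __init__(self, factor=2):
        self.factor = factor   # assumed bound: cost <= factor * len(text)
        self.entries = []      # (is_snapshot, payload, read_cost)
        self._tip = ""         # cached text of the newest revision

    def append(self, text):
        store_snapshot = True
        if self.entries:
            delta = make_delta(self._tip, text)
            # Cost of reading this revision via the existing delta chain.
            cost = self.entries[-1][2] + len(delta[2])
            store_snapshot = cost > self.factor * len(text)
        if store_snapshot:
            # Chain too long (or first revision): start a fresh chain.
            self.entries.append((True, text, len(text)))
        else:
            self.entries.append((False, delta, cost))
        self._tip = text
        return len(self.entries) - 1

    def read(self, rev):
        # Walk back to the nearest snapshot, then replay deltas forward.
        start = rev
        while not self.entries[start][0]:
            start -= 1
        text = self.entries[start][1]
        for _, delta, _ in self.entries[start + 1:rev + 1]:
            text = apply_delta(text, delta)
        return text

    def read_cost(self, rev):
        """Characters that must be read to reconstruct revision `rev`."""
        return self.entries[rev][2]
```

In this sketch, reading a revision never touches anything older than the
nearest snapshot, so the amount of data read stays within factor times the
size of the revision itself, however long the log grows.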

> (Also, on one of my
> production servers, I don't install gcc, which makes mercurial difficult
> to install, and it is where I host some repositories)

Most people install Mercurial and Bazaar-NG from binary packages, though,
so this is not an issue for either in practice.

> Bzr could be better about not having to load support for all of its
> features until they are actually needed. 'hg' actually uses a solution
> called 'demandload', which we probably could just move directly into the
> bzr code.

You should grab it and use it.  It's very nice.  It imposes a one-time cost
when an attribute of a demand-loaded symbol is looked up; after that, the
importing namespace is patched with the real module, so there's no
subsequent cost.

> Timing hg clone of hg code isn't quite the same as timing bzr.dev code.

Right.  That's why I used a Linux kernel tree for my benchmarks.

> Notice that this time, we actually create a new branch faster than hg
> without using hardlinks, and in a system which has approximately a 1.1s
> startup overhead.

Actually, this is not the case.  See below:

> Now, I admit to a little bit of cheating, in that I didn't create
> working trees here.

You did, in the Mercurial case.  You need to use "clone -U" to avoid
populating the working directory.

> But I did want to make people aware that hg isn't 30x faster
> than bzr. In many cases it is more in the 4-7x range.

I only found one case where hg was 33x faster, and that was cloning over
a WAN, where bzr seems to be doing something very inefficient.

But in most other cases, hg was more like 8x to 12x faster in
my tests.