Performance requirements for bzr checkout --lightweight

John Arbash Meinel john at arbash-meinel.com
Mon Sep 1 17:02:57 BST 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

James Westby wrote:
> Hi,
> 
> I wasn't sure what the best way to raise this issue was, so I thought
> I would at least start with a mail to the list.
> 
> First of all, let me say that I realise that performance has been
> a major focus for a long while now, and things have improved
> a lot. I do not wish this email to insult anybody's work.

While that is true, 'checkout --lightweight' has *never* been a primary focus.
It won't really scale, it doesn't let you do a local commit, etc. Shallow
branches are really the better solution for this; we just haven't gotten there
yet.

I'm not saying we can't spend some time to make it reasonably faster, I'm
mostly just pointing out that it is somewhat slow because it isn't something
we've ever optimized for.

> 
> There has been a discussion on ubuntu-devel recently about
> DistributedDevelopment, the plan to make all of Ubuntu available
> in bzr. One of the use cases that people are concerned about
> is sponsorship of a package that they have not touched before,
> which is a fairly common occurrence. This currently involves
> getting the latest version of the code, applying a diff, building
> and testing, and then uploading.
> 
> In this discussion it was noted that the first time looking at
> a package is one area where bzr is going to be slower than
> current tools, as there is more to do than a simple "apt-get
> source". Some timings were done, and people had widely
> varying results, but testing with the latest and greatest
> showed that the times for "checkout --lightweight" were
> heading towards those of "apt-get source", but still
> too far away for comfort.

I'm glad you're having the discussion. I would be more curious to know what
the real numbers are. It is going to be very hard to beat "wget foo.tar.bz2"
because that is rather focused on just snapshotting the tip.
We can do a lot better once you already *have* a snapshot, as we can do
incremental updates (which .tar.bz2 doesn't do well).


> 
> I realise that emphasising one operation doesn't necessarily
> help with performance optimisation, but I would like to
> ask for some attention to be paid to this operation. Ideally
> a lightweight checkout of a packaging branch would take less
> than 150% or 200% of the time for an "apt-get source" of the
> same package.

So does this mean we are at 100x, 10x, 3x? There is a lot of variance here. If
we were at 210%, *I* probably wouldn't focus on it (whether or not other people
feel it is critical). If we are at 100x then something needs to be done.
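
As a rough sketch of how such numbers could be collected (the helper names
time_command, slowdown and meets_target are hypothetical and are not part of
bzr; the 2.0 default just mirrors the "200%" figure quoted above), something
like the following would time each command once and report the ratio:

```python
import subprocess
import time


def time_command(argv):
    """Run argv once and return its wall-clock time in seconds.

    Raises subprocess.CalledProcessError if the command fails, so a
    broken run is never mistaken for a fast one.
    """
    start = time.perf_counter()
    subprocess.run(
        argv,
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


def slowdown(candidate_seconds, baseline_seconds):
    """Return candidate/baseline; 2.0 means 200% of the baseline time."""
    if baseline_seconds <= 0:
        raise ValueError("baseline time must be positive")
    return candidate_seconds / baseline_seconds


def meets_target(ratio, target=2.0):
    """True if the ratio is within the target (assumed default: 200%)."""
    return ratio <= target


# Example use (not executed here; BRANCH_URL and PACKAGE are placeholders
# that you would fill in yourself, and each command needs a clean directory):
#
#   baseline = time_command(["apt-get", "source", PACKAGE])
#   candidate = time_command(
#       ["bzr", "checkout", "--lightweight", BRANCH_URL, "work"])
#   ratio = slowdown(candidate, baseline)
#   print("%.1fx, within target: %s" % (ratio, meets_target(ratio)))
```

A single run over a network is noisy, so it would be worth repeating the
measurement a few times and also noting the bzr version and the transport
used.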

Also, is this with bzr-1.5 or 1.6? I know 1.6(.1) got a lot better at fetching
large (lots of files, not necessarily lots of history) repositories.

> 
> I am happy to provide branches and traces on demand, and I can
> ask others to provide them as well where they are seeing worse
> performance than I am.

With 1.6.1 we brought back latency multipliers, with the benefit of much better
streaming. But 1.5 with launchpad's 1.6 internally is a lot worse than 1.5
with 1.5, regardless of the latency or bandwidth.

> 
> One slightly concerning thing is that bzr+ssh:// is consistently
> slower than http:// for this case. I think this is partly
> due to the SSH negotiation at the start, and partly due to the
> number of smart server calls before the data is streamed out.
> I say concerning as documentation will specify lp: transports,
> assuming that they will work well for most cases, and I would
> rather not have to confuse the situation by trying to
> explain how to checkout over http and then switch to bzr+ssh://.

Well, if it was 1.6 (not 1.6.1) I would have an obvious answer, which is why I
released 1.6.1rc1. There might be a similar issue in 1.5 that we didn't really
work on.


> 
> You can find the root of the discussion at
> 
> https://lists.ubuntu.com/archives/ubuntu-devel/2008-August/026249.html
> 
> and the most interesting part for this mail starting at
> 
> https://lists.ubuntu.com/archives/ubuntu-devel/2008-August/026249.html
> 
> The timings from Emmet are at
> 
> https://lists.ubuntu.com/archives/ubuntu-devel/2008-August/026249.html
> 
> 
> Thanks,
> 
> James

I'm glad you provided links, but I would prefer summaries from you (when
possible) over having to read yet another mailing list :).

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFIvBIxJdeBCYSNAAMRAtFdAJ4reuoA57EYSGqaXaazaNGUUGDxhwCfSsi6
jTxcS2ZHE2Vf8VxaerlSLRg=
=y7n1
-----END PGP SIGNATURE-----


