Alternate go at fast pull: bzrbundler

John Arbash Meinel john at
Tue Sep 5 15:54:50 BST 2006

Lalo Martins wrote:
> Ok, so the rpc idea sounded nice, but didn't work so well in practise (the
> bzrlib abstractions don't fit it easily).
> But this week I actually sat to learn mercurial, and I realised it's not
> really very hard for bzr to do what it does, since we already have
> bundles.  What I needed was something on the server side that generates
> bundles for me, and the ability to use that from the client side.


> - recognise a bundler as a bundle if you're currently inside a branch (in
> which case it will generate a bundle from your current tip to the remote
> branch's tip); this allows you to `bzr merge` from a bundler
> - allow you to branch from a bundle if that bundle is "rooted" at the
> empty tree (null revision), so you can `bzr branch` from a bundler
>
> If people generally agree with these changes, I can reimplement them as
> changes to and post a bundle (with unit tests).  Right now, in an
> experimental stage, I'm more comfortable with them as a plugin.
>
> This is not a substitute for the complete smart server that's being worked
> on, but it's a quick fix that makes sense (IMO) on the bzr model, and that
> allows us to use bzr more quickly until the smart server lands.

I do believe the smart server is going to be designed to communicate in
bundles when possible.

> Known issues:
> - Branching from a bundle spews "Inventory sha hash mismatch" for *every*
> revision; which probably means I'm doing something wrong, but the
> resulting branch always has the right contents, so I don't know that it is.

I don't remember the exact details of this. It turns out the Revision
entries have an 'inventory_sha1' hash recorded, but it is never checked.
Honestly, we want to get rid of it and replace it with a
testament_v1_sha1='' element instead. The inventory hash can change
whenever you change the serialization format, but testaments are designed
to stay consistent and to include only the stuff you really want to
testify to (which is why they are used for revision signing).

> - As a bundle is not a branch, `bzr branch -r` can't use revision numbers.
> It does work with revids though.  It might be possible to fake revno
> support by introspecting the bundle info, I'm just not sure it's worth it.

This is something that I think would be good to do for the next format
of Bundles. Basically, they will still need a source branch to get old
ancestry from (and to have the context for patching), but if we can make
them internally look just like a different branch format, things will be
a lot easier.
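
As a rough sketch of what "look like a branch" could mean for revision
numbers, the class below layers a bundle's revisions on top of the source
branch's history. Everything here is hypothetical (the class name, its
methods, and the assumption that the bundle's revisions extend the source
history linearly); it is not an existing bzrlib API.

```python
class BundleAsBranch:
    """Hypothetical read-only view: source history plus bundle revisions."""

    def __init__(self, source_history, bundle_revision_ids):
        # source_history: revision ids already in the local branch, oldest first.
        # bundle_revision_ids: revision ids carried by the bundle, oldest first.
        # Assumption: the bundle continues directly from the source tip.
        self._history = list(source_history) + list(bundle_revision_ids)

    def revision_history(self):
        """Return all revision ids, oldest first."""
        return list(self._history)

    def revno(self):
        """Return the revision number of the tip."""
        return len(self._history)

    def get_rev_id(self, revno):
        """Map a 1-based revision number to its revision id."""
        if not 1 <= revno <= len(self._history):
            raise IndexError("no such revision number: %d" % revno)
        return self._history[revno - 1]
```

With such a view, `-r 3` could be resolved by calling `get_rev_id(3)`
without the caller knowing whether revision 3 came from the source branch
or from the bundle.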

> - The "basis" argument of branch is not implemented; not sure it even
> makes sense in this context.
> - I only bothered to write a mod_python handler.  A CGI (or WSGI) version
> is left as exercise to the reader.  Bundles (or bundlers) with one would
> be gladly accepted.
>
> Branching or merging from the bundler (from a remote server) gives me
> dramatic speed increases; small branches (such as bzrbundler itself)
> are almost instantaneous.  A larger branch (955 revisions, 798 items
> in the inventory), however, took a long time to compute and download
> the bundle, then the whole process took longer than the "usual" way
> (didn't complete yet, so I don't know how long), and used horrible
> amounts of CPU and memory, so it may not be really an option.

Is it really that much faster than pycurl + http for small branches?

For large branches, yes, it will be unbearably slow. Bundles were only
really designed for passing around a few patches. They actually have to
diff every revision against its rightmost ancestor, and thereby do all
sorts of really crummy things.
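
A toy Python sketch of why this scales badly: one diff is computed per
revision, so the work grows with the length of the history rather than
with the size of the final tree. This is not the bundle code; the
`build_bundle` function, its single-file model of a revision, and the
single-parent simplification are all assumptions for illustration.

```python
import difflib


def build_bundle(revisions):
    """Return one (revision_id, patch) pair per revision.

    revisions: list of (revision_id, parent_id, text) tuples in topological
    order; parent_id is None for a revision based on the empty tree.
    """
    texts = {None: ""}  # the null revision has empty content
    patches = []
    for revision_id, parent_id, text in revisions:
        texts[revision_id] = text
        old = texts[parent_id].splitlines(keepends=True)
        new = text.splitlines(keepends=True)
        # One full diff per revision, however small the change is.
        patch = "".join(
            difflib.unified_diff(
                old, new, fromfile=str(parent_id), tofile=revision_id
            )
        )
        patches.append((revision_id, patch))
    return patches
```

For a branch with 955 revisions, this loop alone would run 955 diffs
before anything is sent, even in this simplified one-file model.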

The next format for Bundles will be a lot better. (I wrote the original
code as 'changesets' almost 1 year ago, and it hasn't aged well.)

There is a spec for some of the changes that we want to make:

I'm not able to work on it right now, but if you are interested, I would
be happy to help you get going on it.

> I have a rough plan for adding a "submit" command to the bundler
> protocol; it will get a signed bundle, verify it against a keyring
> specified in the configuration (for basic access control), and apply
> it to the branch.  I'll get to it later this week if I get another
> large chunk of free time.
> best,
>                                                Lalo Martins

Because Martin and Andrew Bennetts are hard at work on a smart server,
your time would probably be best spent elsewhere. I realize they need to
be more vocal on this list about their status, though. Otherwise people
won't realize what is going on.

