[rfc] bzr-colo into core

Mon Mar 21 10:36:57 UTC 2011

> From: Martin Pool <mbp at canonical.com>
> Date: Mon, 21 Mar 2011 18:48:36 +1100
> 
> Let's make bzr-colo the default model for bzr 3.0 in September.

I haven't yet had a chance to try bzr-colo, so please take what's
below with a grain of salt.

There's an important (for me) use case of working in parallel on two
or more branches, all of them bound to a remote repo, e.g. a
development trunk and a release branch.  With the separate bound
branches I use now, I can easily keep each one reasonably well
synchronized with upstream, and switching to the other branch is just
a "cd ../BRANCH; bzr up" away.
If both branches are not far behind the upstream, this is almost
instantaneous, and the subsequent build is very fast because most of
the build products are present and up to date.
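
To make the workflow concrete, here is a rough sketch of it.  The
upstream location and the branch names "trunk" and "release" are
placeholders of mine, not a real server, and the helper function names
are made up for illustration:

```shell
# Placeholder upstream location; not a real server.
UPSTREAM=${UPSTREAM:-bzr+ssh://example.org/project}
# Set BZR=echo for a dry run that only prints the arguments.
BZR=${BZR:-bzr}

# One-time setup: one bound branch (a heavyweight checkout) per
# upstream branch, each in its own directory with its own tree.
setup_trees() {
    $BZR checkout "$UPSTREAM/trunk" trunk
    $BZR checkout "$UPSTREAM/release" release
}

# "Switching" branches: change to the sibling directory and update
# only that tree.  Build products in the other tree are left alone.
work_on() {
    cd "../$1" && $BZR update
}
```

From inside the trunk tree, "work_on release" is then the whole
switch.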

By contrast, with colocated branches, I will have to "bzr switch" to
the other branch, which will probably be slower than just a "cd".  The
equivalent "git checkout BRANCH" takes between 6 and 10 seconds in the
Emacs repository on a fast GNU/Linux machine (there are about 2.7K
files in the tree); how much would "bzr switch" take on a slower
machine or on Windows?  I just tried "git checkout" on Windows, and it
took me a whopping 81 seconds, most of it doing I/O (Windows
filesystems don't come close to the speed of Linux filesystems).
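
For anyone who wants to produce comparable numbers, something like
the following would do.  It is only a sketch: "elapsed" is a helper
name I made up, it has whole-second resolution, and it assumes a
"date" that supports "+%s":

```shell
# Print the wall-clock time, in whole seconds, that a command takes.
# The command's own output and exit status are discarded.
elapsed() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1 || true
    end=$(date +%s)
    echo $((end - start))
}
```

Running "elapsed bzr switch release" in a colocated setup and
"elapsed bzr update" in a separate bound branch would give the two
figures to compare (the branch name is again a placeholder).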

And how long will it take to build the branch I switched to, what with
most products made outdated by the switch?  In Emacs, I'd probably
need a full bootstrap, just to be sure I don't get subtle bugs from
mixing products from different branches that diverged quite a bit.

I guess syncing with upstream will also fetch all the colocated
branches under this model, right?  If so, it will be slower, while bzr
is just barely fast enough already.  With separate branches, I update
a branch only when I need to work on it or build it for testing.

And then there's the potential confusion about which branch I'm
working on at any given moment (with separate trees I just need to
look at the shell prompt).

So perhaps colocated branches are a fine default for small projects,
but I'm not sure they scale up well to projects such as Emacs, which
has around 104,000 revisions and whose full build takes minutes,
sometimes even half an hour.

Therefore, I hope that (a) the possibility of having standalone
branches will not disappear from bzr, and (b) working in such a
standalone branch will not become harder or slower than it is now.

> People working on large trees really want to have just one on-disk
> checkout at a time

Well, that's not true, at least not for me.  For example, I sometimes
compare two branches to figure out why something that works in one
doesn't work in the other.  This is not just
"bzr diff" (which, I presume, will fully support comparison of files
with another colocated branch, even if it is not checked out).
Sometimes, I don't know what is different, e.g. which files were
added/removed, and need to look around.  Wouldn't this be harder with
colocated branches?
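
With two trees on disk, this kind of looking around is plain
directory comparison.  A sketch, where "tree_delta" is a name I made
up and the two arguments are whatever directories hold the trees:

```shell
# List files that exist in only one tree or whose contents differ,
# leaving out anything under the .bzr control directories.
tree_delta() {
    diff -rq "$1" "$2" | grep -v '\.bzr'
}
```

For example, "tree_delta ../trunk ../release" lists the added,
removed and changed files without my having to know in advance where
to look.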

> so they don't use too much disk space

I can't believe anyone is still bothered by disk space.

> and so they don't waste time building a new tree just to start a new
> branch.

In practice, you will have to "waste" that time anyway, unless I
misunderstand how the colocated branches will work.  Bazaar cannot
possibly know which build products need to be removed as a result of
the switch, so a "make clean" or its equivalent is necessary, and
some projects need much more, like "make bootstrap".

> You can also address any colocated branch from the same
> bzrdir with eg 'bzr diff -r colo:other_branch'.

Is this form of "bzr diff" slower than a plain "bzr diff", or than
running "diff" on two separate trees?