On different ways to combine trees

Piotr Kalinowski pitkali at gmail.com
Tue Jul 23 16:28:55 UTC 2013


I've recently been playing with different methods of combining trees
into a larger entity, including externals, scmproj, and the builder
plugin. I'd like to share my thoughts and ask for opinions in case I
have missed something, as I have not touched bazaar for years.

The small scale use case is the ~/.bazaar directory with various plugins
branched from Launchpad. The large scale use case is merging
configuration for several applications into one `project.'

(Disclaimer: yes, I've heard that bazaar development has effectively
stalled. I'm bored, though, and since I had to use bazaar recently in
hopes of contributing a patch to some emacs library, this is what drew
my attention.)


First I tried using externals. It's nice: it does not add commands of
its own (which apparently imposes certain restrictions), and it actually
works. Well, kind of. I could branch, and I could pull the whole thing
without writing manual scripts or shell loops iterating over plugins/.

However, push would result in reporting a whole lot of errors, because I
don't have write access to any of the plugins. Pushing the super-project
still worked though, so if you can live with something like this, the
small scale use case is essentially solved.

This could be made better by pushing only those externals that actually
need it. But that would mean first comparing each and every one of them
with its push location, and initiating a push only if something is
missing there. That might sound like a good idea, at least until you
realise that's what the push operation does anyway.

There's actually a bug on Launchpad for bzr-externals: it tries to push
even if an external uses a read-only transport. Fixing that in some way
would certainly be good, but in my case of using lp:XXX URLs I actually
ended up with bzr+ssh URLs in the configuration. The push is correctly
refused by the remote, but the transport is clearly read-write. (Then
again, fixing this bug *and* changing the URLs to read-only ones would
certainly eliminate the error messages too.)

So I thought for a moment about extending the externals plugin with a
notion of `pull only' externals that would just be skipped when pushing
the parent tree. That would immediately solve the problem for the small
scale use case, even without meddling with the URLs of individual nested
trees. I haven't followed up on this one though. At least not yet.
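
A minimal sketch of the `pull only' idea, in plain Python rather than
against the actual bzr-externals code. The External record, its
pull_only flag, the function name, and the example URLs are all
hypothetical; nothing here reflects the plugin's real configuration
format or hooks.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class External:
    """Hypothetical record for one nested tree (not the plugin's format)."""
    path: str
    url: str
    pull_only: bool = False  # assumed flag: never push this tree


def externals_to_push(externals: List[External]) -> List[External]:
    """Return the externals that a push of the parent tree should visit."""
    return [ext for ext in externals if not ext.pull_only]


if __name__ == "__main__":
    # Illustrative entries only; the URLs are placeholders.
    config = [
        External("plugins/third-party", "lp:XXX", pull_only=True),
        External("plugins/mine", "bzr+ssh://example.com/mine"),
    ]
    for ext in externals_to_push(config):
        print(ext.path)
```

With a filter like this, pushing the parent tree would only ever touch
trees not marked as pull only, so the refused pushes would not happen
in the first place.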


I remember when bzr-scmproj used a separate branch for tracking control
information, and I remember experimenting with it, even at the office
(anything is better than interacting directly with CVS, which we were
using at the time ;).

Now it uses files in .bzrmeta, and has a much simpler design. I think
that in general those changes are for the better. And since it uses a
separate set of commands, it solves my small scale problem, because I
can manually push only those trees that a) require it, and b) I know I
can push.

(That is, after I fixed a bug in project-add, which I used in a shell
loop to quickly convert the externals setup into a scmproj setup.)

Then I read some more, and it is now my understanding that
project-commit saves a snapshot of all subtrees, and project-update
updates only to the registered snapshot, so if I really want to update
all the subtrees from their respective sources, I should do something
like

$ bzr pcmd pull

Well, that's easy enough, although in general it seems that scmproj is
now pretty much the same thing as externals, with the only difference
being that it is implemented as a separate set of commands instead of
hooks.

I could live with that for my small scale use case, but I am a pretty
curious person, so I proceeded to play with the idea more. I also like
to speculate about more general cases, which may or may not apply to the
use cases I have on hand.


For bzr-builder, I just checked that I could use it for checking out
configuration, and updating it. That would be good enough for my small
scale use case, but for the large scale one I'd like to have coordinated
pushing of various subtrees into their respective remote locations, for

*What I'd really like (and might code)*

I may be biased, but I was thinking of something more like the guestrepo
extension that Mercurial has. So I'd like the ability to define
subtrees, and then specify which revisions to get for some of them, but
perhaps mark others as always following a certain branch rather than a
specific commit.

But then if I changed my mind at some point, or if I were releasing some
piece of software developed in an environment like that, I could easily
register the current snapshot of the whole thing (or maybe also
selectively register specific revisions for some subtrees).
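
To make the distinction between pinned and branch-following subtrees
concrete, here is a rough sketch of the data involved. It is only an
assumption about how such a plugin might model things: the Subtree
class, the snapshot function, and the revision ids are invented for
illustration and do not come from guestrepo, scmproj, or bzrlib.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class Subtree:
    """Hypothetical description of one subtree in a project."""
    url: str
    revision: Optional[str] = None  # None means "follow the branch tip"

    @property
    def pinned(self) -> bool:
        return self.revision is not None


def snapshot(project: Dict[str, Subtree], current: Dict[str, str],
             only: Optional[Set[str]] = None) -> Dict[str, Subtree]:
    """Return a copy of the project with subtrees pinned to revisions.

    `current` maps a subtree path to the revision id found in the work
    area. If `only` is given, just those paths are pinned and all other
    entries are copied unchanged.
    """
    result = {}
    for path, tree in project.items():
        if (only is None or path in only) and path in current:
            result[path] = replace(tree, revision=current[path])
        else:
            result[path] = tree
    return result
```

Calling snapshot(project, current) would register the whole work area,
while passing only={"some/path"} would pin just the named subtrees and
leave the rest following their branches.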

This would include project/external-command support, the ability to
pull/push subtrees, to inspect the state of the whole work area with
respect to the project configuration (including a snapshot), and so on.

An interesting extension would be to imitate git submodules in a certain
way to get subsets of subtrees, so that you could have a work
environment with only some subtrees actually `checked out'. The git
submodules imitation part is that the user would decide locally what
they want, instead of predefined subsets of subtrees being stored in
some configuration (although it could be useful to allow easy selection
of many subtrees at once, maybe a bit like guards in Mercurial queues
work).
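
The local selection could be sketched roughly as below. This is only
loosely in the spirit of Mercurial queue guards (real guards also have
negative forms, which are left out here), and the function name, the
label mapping, and the "no selection means everything" rule are all
assumptions made for the sake of the example.

```python
from typing import Dict, Iterable, Set


def selected_subtrees(labels: Dict[str, Set[str]],
                      wanted_paths: Iterable[str] = (),
                      wanted_labels: Iterable[str] = ()) -> Set[str]:
    """Return the subtree paths a user wants checked out locally.

    `labels` maps each subtree path to a set of labels. A subtree is
    selected if it is named explicitly or carries any wanted label.
    With no selection at all, every subtree is returned.
    """
    paths = set(wanted_paths)
    wanted = set(wanted_labels)
    if not paths and not wanted:
        return set(labels)
    return {p for p, tags in labels.items() if p in paths or tags & wanted}
```

The selection itself would live in the local work area, not in the
versioned project configuration, so two people could share one project
definition and still check out different subsets.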

For ultimate flexibility this would use a separate command set
(preferably a small one) instead of hooks. Of course this means no easy
GUI support, and lots of room for developers to make errors, as in the
case of git submodules. But at least I get conceptual consistency and
fewer limitations ;)

*Closing words*

I'm thinking about implementing such a mesh of submodules/guest
repos/scmproj as a new plugin. I think that starting fresh in this case
makes sense. But I wanted first to write to the list in case someone had
a much better idea, or was on the verge of publishing something similar
themselves, or whatever. Or maybe something good enough already exists,
and I just don't know how to search for it.

In the long term, though, for my large scale use case I want to extend
the idea with the ability to get those subtrees from different version
control systems. Of course, I could theoretically just use bidirectional
bridges between bazaar and other version control systems, if they
worked. Or I could fix those bridges, but that is far beyond what I
need, or want to do right now. So this would move a bit in the direction
of config-manager, which last time I checked did not advertise support
for Mercurial or git (relying on bzr-git and bzr-hg? not caring?), and
was not even a proper plugin. (Yes, there's no hard requirement to make
this a plugin. I just want to make a plugin out of it. Because I can.)

If nothing else, I'm sure this would be an interesting programming
exercise ;)

Piotr Kalinowski
Intelligence is like a river: the deeper it is, the less noise it makes.
