[RFC] proposed user doc for nested trees

Stephen J. Turnbull stephen at xemacs.org
Thu May 7 18:49:22 BST 2009

A couple of quick comments.  I doubt I'll get back to it before next
week; Mother's Day activities are more important than Bazaar!

Ian Clatworthy writes:

 > +Bazaar has good support for building and managing external libraries
 > +and subprojects via a feature known as *nested items*. In particular,
 > +nearly all of Bazaar's commonly used commands understand nested items
 > +and Do The Right Thing as explained below.

In this context I think "Do The Right Thing" is not what you want, as
there is rarely going to be a single right thing, and for many
projects it may vary across nested items.  How about

    Bazaar's commands understand nested items.  Where appropriate,
    Bazaar handles them seamlessly with the rest of your tree, while
    helping you coordinate their state in your project with upstream
    status.

 > +Note: This feature requires a recent branch format such as ``2.0``
 > +or later.

Heaven help us!

 > +Refreshing a nested branch
 > +--------------------------

Insert here

    An important aspect of nested branches is managing "API churn" in
    the upstream project.  For that reason, by default Bazaar will not
    automatically pull in changes.

 > +As bugs are fixed and enhancements are made to nested projects, you
 > +will want to update the version being used. To do this, ``pull`` the
 > +latest version of the nested branch. For example::
 > +
 > +  bzr pull src/lib/sax
 > +
 > +If the latest revision is too unstable, you can always use the ``-r``
 > +option on the ``pull`` command to nominate a particular revision or tag.
 > +
 > +Now that you have the required version of the code, you can make
 > +any required adjustments (e.g. API changes), run your automated tests
 > +and commit something like this::
 > +
 > +  view src/lib/sax/README
 > +  (hack, hack, hack)
 > +  make test
 > +  bzr commit -m "upgraded SAX library to version 2.1.3"

I think there should be options/configurations that allow updating
groups of nested items.  Eg, you release v2.0, and branch 2.x for
maintenance.  The trunk continues on as pre-3.0.  Now is a good time
to switch your dependencies en masse to upstream development versions,
and in 6 months time you switch them back to stable (if they haven't
released in the interim), etc, etc.

This is related to but not the same as "pegging" as you describe below.
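
To make the group-update idea concrete, here is a minimal sketch in Python.
Everything in it is hypothetical: the ``NestedItem`` record, the "stable" and
"devel" channel names, and the example.invalid URLs are assumptions for
illustration, not anything Bazaar provides or the proposal specifies.

```python
# Hypothetical sketch: these names are illustrative, not part of Bazaar.
from dataclasses import dataclass


@dataclass(frozen=True)
class NestedItem:
    path: str    # where the item lives inside the containing tree
    stable: str  # upstream location of the maintenance branch
    devel: str   # upstream location of the development branch


def switch_group(items, channel):
    """Map every nested item's path to its upstream on the given channel."""
    if channel not in ("stable", "devel"):
        raise ValueError("unknown channel: %r" % (channel,))
    return {item.path: getattr(item, channel) for item in items}


ITEMS = [
    NestedItem("src/lib/sax",
               "bzr://example.invalid/sax/2.x",
               "bzr://example.invalid/sax/trunk"),
    NestedItem("src/lib/http",
               "bzr://example.invalid/http/1.x",
               "bzr://example.invalid/http/trunk"),
]

# After branching 2.x for maintenance, move trunk to upstream development.
trunk_sources = switch_group(ITEMS, "devel")
```

Switching back to stable six months later would be the same call with
"stable"; the point is that the whole group moves in one step rather than
one ``pull`` per nested item.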

 > +Changing a nested branch
 > +------------------------
 > +
 > +As well as keeping track of which revisions of external libraries
 > +are used over time, one of the reasons for nesting projects is to
 > +make minor changes. You may want to do this in order to fix and
 > +track particular bugs you need addressed. In other cases, you may want
 > +to make various local enhancements that aren't valuable outside
 > +the context of your project.
 > +
 > +As support for nested branches is integrated into most commonly
 > +used commands, this is actually quite easy to do: simply make
 > +the change to the required files as you normally would! For example::
 > +
 > +  edit src/lib/sax/parser.py
 > +  bzr commit -m "fix bug #42 in sax parser"
 > +
 > +Note that Bazaar is smart enough to recurse by default into nested
 > +branches, commit changes there, and commit the new nested branch tips
 > +in the current branch. Both commits get the same commit message.

I disagree with this default, at least to start with.  I do not think
Bazaar should by default commit changes to the nested branch at the
same time.  In fact, I'm almost tempted to say that Bazaar should
refuse to commit if the main project and nested branches both
have changes in them, unless a forcing option is used.  This is
especially true given the asymmetric behavior you propose later for

If users want this default, then it can/should be changed later.
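
The stricter rule could be sketched roughly as follows. This is only an
illustration of the policy, under the assumption that the caller already
knows which trees have changes; ``MixedChangesError``, ``check_commit`` and
the ``force`` parameter are made-up names, not Bazaar API.

```python
# Hypothetical sketch of the "refuse unless forced" policy; not Bazaar code.
class MixedChangesError(Exception):
    """The containing tree and at least one nested branch both have changes."""


def check_commit(outer_changed, nested_changed, force=False):
    """Raise MixedChangesError if a commit would span both levels.

    outer_changed:  True if the containing tree has uncommitted changes.
    nested_changed: paths of nested branches with uncommitted changes.
    force:          stands in for an explicit forcing option.
    """
    if outer_changed and nested_changed and not force:
        raise MixedChangesError(
            "changes in both the containing tree and nested branches: "
            + ", ".join(sorted(nested_changed))
        )
```

A commit touching only one level passes the check; a commit touching both
passes only when forcing is requested explicitly.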

 > +Reviewing nested branch changes
 > +-------------------------------
 > +
 > +Just like ``commit``, the ``status`` and ``diff`` commands implicitly
 > +recurse into nested branches. In the case of ``status``, it shows both the
 > +nested branch as having a pending change as well as the items within it
 > +that have changed. For example::

I think that nested items should have an additional value for status,
namely "diverged".  This means that they are up-to-date in your tree,
but different from upstream.
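
As a rough sketch of how such a status might be derived, assume three
revision ids are available for each nested item: the one recorded by the
containing tree, the local tip, and the upstream tip. The function name and
the "modified" and "unchanged" labels are assumptions for illustration; only
"diverged" comes from the suggestion above.

```python
# Hypothetical sketch; the status labels other than "diverged" are made up.
def nested_status(recorded_rev, local_tip, upstream_tip):
    """Classify a nested item from three revision ids."""
    if local_tip != recorded_rev:
        # The containing tree has a pending change for this item.
        return "modified"
    if local_tip != upstream_tip:
        # Up to date in the containing tree, but different from upstream.
        return "diverged"
    return "unchanged"
```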

 > +Undoing nested branch changes
 > +-----------------------------

 > +If you ``uncommit`` the current branch, then just that commit is undone
 > +and the commit to the nested branch is left intact. The reason for this
 > +behaviour is simple: Bazaar doesn't know whether the commits were done
 > +as multiple steps or not and whether you want one or both commits undone.

Urg.  "In the face of ambiguity, refuse the temptation to guess."  If
Bazaar doesn't know, it shouldn't try to be smart.  I think this
should be an error unless the user specifies.

I doubt that will fly in this crowd, so as an alternative, I suggest
that Bazaar should warn (ie, be a little bit loud in the messages it
emits about what it's doing) about nested branch commits that are left
"pending containing-branch commit" after the uncommit.

Technically, I don't see why it would be hard to add a notation to the
nested branch commit that it was part of a higher level commit.
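
A minimal sketch of that notation and the resulting warning, assuming each
nested commit is represented as a plain dict. The "part-of" key, the function
name and the message wording are all hypothetical, not an existing Bazaar
revision property or message.

```python
# Hypothetical sketch; "part-of" is an invented marker, not a Bazaar property.
def pending_after_uncommit(outer_rev_id, nested_commits):
    """Return warnings for nested commits made as part of outer_rev_id.

    nested_commits: iterable of dicts with keys "branch" and "rev_id", plus
    an optional "part-of" naming the containing-branch revision.
    """
    return [
        "warning: %s revision %s is pending a containing-branch commit"
        % (c["branch"], c["rev_id"])
        for c in nested_commits
        if c.get("part-of") == outer_rev_id
    ]
```

With a marker like this, an uncommit of the containing revision can name
exactly the nested commits it leaves behind instead of guessing.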

 > +Relative locations are often more useful than absolute locations
 > +because they:
 > +
 > +* Make it easier to move a related set of projects.
 > +* Imply the transport used to access nested branches.

Maybe add:

    * Are convenient for referring to a nested branch in the same repository.

 > +Virtual projects
 > +----------------
 > +
 > +By design, Bazaar is strict about tracking the actual revisions used of
 > +nested branches over time. Without this, projects cannot accurately
 > +reproduce exactly what was used to make a given build. There are
 > +isolated use cases though where is advantageous to say "give me the
 > +latest tip of these loosely coupled branches". To do this, create a
 > +small 'virtual project' which is just a bunch of *unpegged* nested
 > +branches. To mark nested branches as unpegged, use the ``--no-pegged``
 > +option of the ``nested`` command like this::
 > +
 > +  bzr nested --no-pegged [DIR]
 > +
 > +To stop the nested branch tips from floating and to begin recording
 > +the tip revisions again, use the ``pegged`` option::

Urg.  "Begin recording again"?!  Don't you mean, "requiring explicit
updates from upstream again"?

 > +
 > +  bzr nested --pegged [DIR]
 > +
 > +After changing whether one or more nested branches are pegged or not, you
 > +need to ``commit`` the branch to record that metadata. (The pegged state
 > +is recorded over time.)

I don't see why this shouldn't be done automatically.  OTOH, does this
mean that the pegged state propagates to anybody who is pulling from you?!

I like the interface, but the defaults need careful thought.

 > +For example, you may be managing a company intranet site as a project
 > +which is nothing more than a list of unrelated departmental websites
 > +bundled together. You can set this up like this::
 > +
 > +  bzr init intranet-site
 > +  cd intranet-site
 > +  bzr branch --nested bzr://ourserver/websites/research
 > +  bzr branch --nested bzr://ourserver/websites/development
 > +  bzr branch --nested bzr://ourserver/websites/support
 > +  bzr branch --nested bzr://ourserver/websites/hr
 > +  bzr nested --no-pegged

IMHO this should be an error.  "bzr nested --no-pegged" should default
to DIR=".".  To float all the nested branches, there should be an
"--all" option.  The rationale is that recovering from a failure to
unpeg is easy: "bzr pull DIR && bzr nested --no-pegged DIR && bzr commit".
Recovering from an inadvertent unpeg is likely to involve panic,

bzr pull        # I thought I was just getting your changes to the project.
# get coffee, don't notices that 178 branches of xorg get updated over
# our new terabit connection :-)
make -k && make test    # hope springs eternal ....
# get 6.2 GiB of error messages from the C compiler!
# A good time *will* be had by all!
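
The safer defaults might look something like the sketch below, which models
the argument handling with Python's argparse rather than Bazaar's own option
machinery. The parser, the ``select_targets`` helper and the "--all" flag are
illustrative assumptions; "--all" in particular is the option suggested
above, not one that exists.

```python
# Hypothetical sketch of the suggested defaults; not Bazaar's command code.
import argparse
import os.path


def build_parser():
    parser = argparse.ArgumentParser(prog="nested-sketch")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--pegged", dest="pegged", action="store_true")
    mode.add_argument("--no-pegged", dest="pegged", action="store_false")
    parser.add_argument("--all", action="store_true",
                        help="apply to every nested branch")
    parser.add_argument("directory", nargs="?", default=".")
    return parser


def select_targets(directory, apply_all, nested_paths):
    """Return the nested branches to change, refusing to guess."""
    if apply_all:
        return sorted(nested_paths)
    path = os.path.normpath(directory)
    if path not in nested_paths:
        raise ValueError("%s is not a nested branch; use --all to "
                         "change every nested branch" % path)
    return [path]
```

Run from the top of the intranet-site tree with no DIR, the directory
defaults to ".", which is not itself a nested branch, so the command fails
instead of silently floating everything.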

 > +Commands like ``commit`` and ``push`` need online access to the locations
 > +for nested branches which have updated their tip.

'pull' and 'push', obviously, but why does 'commit' care?  Are we not

 > +In particular, ``commit``
 > +will update any changed nested branches first and only commit to the
 > +containing branch if all nested branch commits succeed. If you are working
 > +offline, you may want to ensure your have a local mirror location defined
typo ----------------------------------+

 > +for nested branches you are likely to tweak. Alternatively, the
 > +``no-recurse-nested`` option to the ``commit`` command might to useful to
 > +commit some changes, leaving the nested branch commits until you are back
 > +online.
 > +
 > +A given top level branch cannot contain multiple copies of a nested
 > +branch.

Technically speaking, why not?  And why announce this now, in a
"desiderata" piece?

 > +In most programming environments, having different parts of the project
 > +using different versions of a library is an integration no-no anyhow,
 > +so enforcing *one* common revision is the right way to prevent this from
 > +happening.

Au contraire.  People who *want* to do this probably know exactly what
they are doing.  For example, the autofools have historically required
multiple versions to be installed to keep their various clients happy
(at one point I had *three* Debian versions of autoconf, and *five*
Debian versions of automake, plus one each of upstream CVS installed
on my Debian box).  If it can't be done for technical reasons, that's
one thing, and if you don't feel like implementing it now because you
doubt it will ever be useful, that's another, but please don't tell me
not to do what I need to do.
