help getting a clue about tracking changes in an integrated library

Wed Sep 22 07:56:08 BST 2010

Thanks for the detailed response.  I need to digest it, but here are 
some questions/comments:

1.  the "poor man's nested branches" approach seems scary/hacky?

2.  if I have my main code in a shared repo, and then as siblings I have 
the lib-upstream and lib-modified, then I can branch lib-modified into 
the main code at an arbitrary location?  In other words, at your step 15 in 
the "monolithic" approach, can I do something like this:

shared repo -+- trunk -+- src
              |         |
              |         +- lib
              |
              +- lib-mod
              |
              +- lib-orig

where lib is a branch of lib-mod?  If so, this seems like what I want, 
assuming I can do all the merges correctly, and bzr tracks the 
relationship between lib <-> lib-mod <-> lib-orig.

3.  I assume I can have multiple different libs handled this way, as 
long as they're all in the shared repo?

4.  To your point below, I do have history in the main project, and I 
want to preserve that obviously, so I'll need to move that into a shared 
repo.  I also push that out to a server...can I push the whole shared 
repo out the same way?

5.  The loom thing sounds interesting, yet also scary.

Thanks,
Chris

On 2010/09/21 19:35, Stephen J. Turnbull wrote:
> Chris Hecker writes:
>
>   >  - lib-1.0 is the released version on the internets.
>   >  - my friend has modified lib-1.0, so it's kind of a branch of
>   >    lib-1.0
>
> It *is* a branch, by definition, once you have both in the VCS.  Maybe
> you mean "fork" (which is a "socially-defined" term, meaning a branch
> of indefinite expected lifespan maintained outside of the parent
> project)?
>
>   >  - when lib-1.1 comes out, I'd like to be able to get those fixes
>
> So far, no technical problem.
>
>   >  - I want lib to be checked into my main source tree, not some other
>   >    directory I have to sync
>
> I assume that by "integrated" you mean this kind of structure:
>
>      top -+- doc
>           |
>           +- src
>           |
>           +- lib   <== the code maintained outside your project
>
> This is going to be a dance in bzr, but closer to "Flashdance" than a
> waltz.  (What you need is "nested branches", and bzr doesn't do that
> yet, although it's a frequently requested feature.  See the "Roadmap
> for Bazaar..." thread.)
>
> There are two basic approaches available at present, "poor man's
> nested branches," and "monolithic branches".  The poor man's nested
> branch works like this:
>
> 1.  Initialize a shared repository.[1]
>
> 2.  Set up a "trunk" branch of your project containing everything but
>      the lib subtree.  (How to do this will depend on how much history
>      of your project needs to be preserved.)
>
> 3.  In that branch, "bzr ignore lib".
>
> 4.  As a *sibling* of your branch, create a lib-upstream branch:
>      copy the lib-1.0 code there, bzr init, bzr add, bzr commit, bzr
>      tag.[2]
>
> 5.  As a *sibling* of lib-upstream, branch from lib-upstream into
>      lib-modified, apply your friend's patches/copy the code into
>      lib-modified, bzr add/rm/mv as appropriate, bzr commit, bzr
>      tag.[2]
>
> 6.  cd into "trunk", and "bzr branch ../lib-modified lib".
>      *Not* "bzr checkout ...": using an independent branch means you
>      can hack in ./lib without worrying about "polluting" your
>      upstreams.
>
> This is probably more intuitive, but the relationships among the
> branches are non-trivial, and depending on your work habits, it may be
> easy to forget to commit your local changes in trunk/lib.  ("cd trunk;
> bzr status" won't remind you about them, for example.)  If you're
> never going to make such changes, then it's no problem. :-)
>
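The six steps above might be sketched as the following shell session. It is a dry run under stated assumptions: the directory names, the tag names, the LIB_SRC and FRIEND_SRC paths, and the "run" helper are all made up for illustration and are not from the original message. Only the bzr subcommands themselves (init-repo, init, ignore, add, commit, tag, branch) are standard.

```shell
#!/bin/sh
# Dry-run sketch of steps 1-6. All names and paths are illustrative assumptions.
LIB_SRC=${LIB_SRC:-/path/to/lib-1.0}            # hypothetical: unpacked upstream release
FRIEND_SRC=${FRIEND_SRC:-/path/to/lib-1.0-mod}  # hypothetical: friend's modified copy

# Hypothetical helper: print each command, and execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

setup_nested() {
    run bzr init-repo repo                        # step 1: shared repository
    run cd repo
    run bzr init trunk                            # step 2: import your project here
    run cd trunk
    run bzr ignore lib                            # step 3: trunk will not track ./lib
    run bzr commit -m "Ignore nested lib branch"
    run cd ..
    run mkdir lib-upstream                        # step 4: pristine upstream code
    run cp -R "$LIB_SRC/." lib-upstream
    run cd lib-upstream
    run bzr init
    run bzr add
    run bzr commit -m "Import lib 1.0"
    run bzr tag lib-1.0
    run cd ..
    run bzr branch lib-upstream lib-modified      # step 5: friend's version
    run cp -R "$FRIEND_SRC/." lib-modified
    run cd lib-modified
    run bzr add
    run bzr commit -m "Apply friend's modifications to lib 1.0"
    run bzr tag lib-1.0-modified
    run cd ..
    run cd trunk                                  # step 6: independent branch, not a checkout
    run bzr branch ../lib-modified lib
}

setup_nested
```

Set RUN=1 to execute the commands rather than print them. There is no error handling, and copying the friend's tree over the top will not notice deleted or renamed files, so those still need "bzr rm" / "bzr mv" by hand before the commit in step 5.
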
> Ongoing workflows:
>
> a.  Normal development: work in trunk as usual: hack, commit, release.
>      For experimental work create feature branches, for release
>      maintenance create maintenance branches.
>
> b.  Friend releases bugfixes etc: apply to "lib-modified" branch, then
>      merge to the "lib" branch in trunk, then merge to any descendants
>      of trunk.  *Your friend must not be including any upgrades to
>      upstream in these patches.*[3]
>
> c.  Upstream releases new versions: apply to "lib-upstream" branch,
>      then merge down the cascade via "lib-modified", "trunk", and any
>      feature or maintenance branches.
>
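Workflow (c) for the nested layout could look roughly like this, starting from the root of the shared repository. Again this is only a dry-run sketch: NEW_SRC, the tag name, the commit messages, and the "run" helper are assumptions, and it reads "merge to trunk" as merging into the trunk/lib branch, since trunk itself ignores lib in this approach.

```shell
#!/bin/sh
# Dry-run sketch of the cascade merge for a new upstream release.
NEW_SRC=${NEW_SRC:-/path/to/lib-1.1}   # hypothetical: unpacked new upstream release

# Hypothetical helper: print each command, and execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

merge_cascade() {
    run cd lib-upstream
    run cp -R "$NEW_SRC/." .                 # then bzr rm / bzr mv by hand as needed
    run bzr add
    run bzr commit -m "Import lib 1.1"
    run bzr tag lib-1.1
    run cd ../lib-modified
    run bzr merge ../lib-upstream            # upstream -> friend's version
    run bzr commit -m "Merge lib 1.1 from lib-upstream"
    run cd ../trunk/lib
    run bzr merge ../../lib-modified         # friend's version -> your nested lib branch
    run bzr commit -m "Merge lib-modified"
}

merge_cascade
```

In a real run, each merge may stop with conflicts that have to be resolved before the following commit. Workflow (b) is the same sequence minus the lib-upstream part: commit the friend's fixes in lib-modified, then do the final merge.
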
> It is possible to do all the work in the same workspace.[4]
>
> For the "monolithic" approach, you need to set up two branches, and
> maybe three, I think, one which is used only for syncing to upstream
> "lib", one which is used for syncing to your friend's version, and one
> for your own work.  If you're starting from scratch (no history in
> your project), then I would
>
> 1.  Initialize a shared repository.[1]
> 2.  Initialize a branch "lib-upstream" in the shared repository.
> 3.  Untar lib-1.0 in lib-upstream.
> 4.  Rename the resulting subdirectory as desired.
> 5.  Add all the files to bzr.
> 6.  Commit.
> 7.  Tag the commit.[2]
> 8.  Branch "lib-modified" from "lib-upstream" in the shared repository.
> 9.  rm -rf the upstream version of the lib directory.  (I'm assuming a
>      tarball for your friend's version.  If not, modify steps 9 and 10
>      appropriately.)
> 10. Untar lib-1.0-modified.
> 11. Rename to the name used in 4.
> 12. "bzr add" any new files, "bzr rm" any deleted files, "bzr mv" any
>      renamed files.
> 13. Commit.
> 14. Tag the commit.[2]
> 15. Branch "lib-modified" to "trunk" (your main development branch).
> 16. Add the files for your project to trunk.
> 17. "bzr add" them, and commit.
>
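Steps 1-17 of the monolithic approach, sketched as a dry run. The tarball paths, the directory names the tarballs unpack to, the tag names, and the "run" helper are all assumptions for illustration. The bzr subcommands are standard.

```shell
#!/bin/sh
# Dry-run sketch of the monolithic setup. Names and paths are illustrative.
LIB_TAR=${LIB_TAR:-/path/to/lib-1.0.tar.gz}                    # hypothetical
FRIEND_TAR=${FRIEND_TAR:-/path/to/lib-1.0-modified.tar.gz}     # hypothetical

# Hypothetical helper: print each command, and execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

setup_monolithic() {
    run bzr init-repo repo                       # step 1
    run cd repo
    run bzr init lib-upstream                    # step 2
    run cd lib-upstream
    run tar -xzf "$LIB_TAR"                      # step 3
    run mv lib-1.0 lib                           # step 4 (assumes it unpacks to lib-1.0)
    run bzr add                                  # step 5
    run bzr commit -m "Import lib 1.0"           # step 6
    run bzr tag lib-1.0                          # step 7
    run cd ..
    run bzr branch lib-upstream lib-modified     # step 8
    run cd lib-modified
    run rm -rf lib                               # step 9
    run tar -xzf "$FRIEND_TAR"                   # step 10
    run mv lib-1.0-modified lib                  # step 11 (assumed unpacked name)
    run bzr add                                  # step 12 (plus bzr rm / bzr mv by hand)
    run bzr commit -m "Import friend's modified lib 1.0"   # step 13
    run bzr tag lib-1.0-modified                 # step 14
    run cd ..
    run bzr branch lib-modified trunk            # step 15
    run cd trunk
    # step 16: copy your project's own files (doc, src, ...) in here
    run bzr add                                  # step 17
    run bzr commit -m "Add project files"
}

setup_monolithic
```

The difference from the nested layout is that lib here is an ordinary subdirectory of every branch, so "bzr status" in trunk does report changes under lib.
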
> Ongoing workflows:
>
> a, b, and c as above, except that when merging to trunk, you merge
> directly to trunk, not to the lib subtree.
>
> Rationale:
>
> For best results in getting help from your friend and/or the "lib"
> upstream project, it's essential to know what changes have been made.
> Having a separate branch for each version puts this information at
> your fingertips, at a slight cost in overhead for each merge from
> upstream.
>
> Plausible variants:
>
> 1.  If you already have history in your project, you need to import
>      that to bzr in the shared repository after step 1.  I suggest
>      renaming that branch to "temporary".  Then instead of initializing
>      "lib-upstream", branch it from "temporary" at step 2.  Then remove
>      the contents of the lib directory (keeping any local patches, ie,
>      those not in your friend's modified version, somewhere safe).  Now
>      proceed as in 3--15, and instead of 16 as above, restore your
>      local patches in trunk.  Finally, remove the temporary branch.
>      (Not relevant to "poor man's nested branches.")
>
> 2.  Proceed as in variant 1, then do development in "lib-upstream",
>      and merge to "lib-modified" and "trunk" for actual use.  This is
>      the best of all worlds from the point of view of communicating
>      with upstream; you can test and demonstrate bugs with a pristine
>      version.  This only works if your friend's modifications preserve
>      enough of lib's API that you can actually run such tests.
>      (Not relevant to "poor man's nested branches.")
>
>      Disadvantage: you have to do the "cascade merge" after every
>      commit to test the version you are producing.
>
> 3.  Use "lib-modified" as "trunk".  Very plausible if you make no or
>      "almost" no changes to lib-modified.
>
>      Disadvantage: almost none, as if you decide to make a
>      "significant" change to lib-modified, you can always branch at
>      that point.  You can also branch "retroactively" if you decide
>      several commits later.  (This requires a rebase, and if you run
>      into that situation you probably want this list's advice at the
>      time.  I'm just noting the possibility here so that you don't
>      worry -- it's not going to be a big deal if you need to do it.)
>
>      I just don't like this variant because I'm anal-retentive about
>      recording history in separate branches. :-)  I also wonder if it's
>      a good idea in "poor man's nested branches", but haven't thought
>      it out.
>
> 4.  There are also "rebasing" workflows that may make sense if changes
>      to lib-upstream or lib-modified become frequent.  bzr is not very
>      good at rebasing, though (where in this assessment I include a
>      scarcity of documentation on these workflows, and a general anti-
>      rebasing philosophy prevailing in the Bazaar community).
>
>      Advantage: better separation of your changes from upstream.
>
> 5.  There are also "loom"-based workflows that may make sense if
>      changes to lib-upstream or lib-modified become frequent.  Looms
>      are not very well documented, at least at the tutorial level, but
>      I know at least two heavy users of looms who might be willing to
>      help: Robert Collins, who maintains the feature in Bazaar, and
>      the FLUFL (if you know what that means, you probably know him
>      well enough to ask ;-).  Looms are much more in keeping with
>      Bazaar philosophy than rebase, but the educational resources are
>      much poorer.  Ie, there are lots of good texts on rebase on the
>      Web.  OTOH, Stacked Git is somewhat like looms, but really looms
>      are a unique feature of Bazaar, so there's not that much wisdom
>      you can borrow from other communities.
>
>      Advantages: better separation of your changes from upstream, and I
>      believe much more automatic than the main workflows described
>      above.
>
> 6.  An alternative to "looms" is "pipelines".  As the names suggest,
>      pipelines are more linear than looms, and I'm not sure they would
>      apply to your use case.  Aaron Bentley maintains them, IIRC.  Like
>      looms, they are minimally documented.
>
>      Advantages: better separation of your changes from upstream, and I
>      believe much more automatic than the main workflows described
>      above.
>
> Footnotes:
> [1]  "bzr help init-repo" for more information.
>
> [2]  Not essential, but very useful for error recovery if you put
> stuff in the branch that shouldn't be there.
>
> [3]  The problem is that bzr cannot know that the upstream changes
> included in your friend's patches are the same ones that you are
> bringing in when you update the lib-upstream branch.  This is likely
> to result in messy merge conflicts.
>
> [4]  "bzr help checkout", especially the --lightweight option, and
> "bzr help switch", for more information.  Also Google the wiki for
> "colocated branches" (the archives will have tons on this, but mostly
> in the form of feature requests and bzr-vs-git discussions; not very
> useful in practical application).
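
As a rough illustration of footnote [4], a single working tree can be pointed at different branches in the shared repository with a lightweight checkout plus "bzr switch". This is a dry-run sketch from the repository root in the monolithic layout. The "work" directory name and the "run" helper are assumptions.

```shell
#!/bin/sh
# Dry-run sketch: one workspace shared by several branches.

# Hypothetical helper: print each command, and execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

one_workspace() {
    run bzr checkout --lightweight trunk work   # working tree only; history stays in trunk
    run cd work
    run bzr switch ../lib-modified              # same directory now shows lib-modified
    run bzr switch ../trunk                     # and back again
}

one_workspace
```

Commits made in the lightweight checkout go straight to whichever branch it currently points at, so it is worth checking "bzr info" before committing.
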
>
>