help getting a clue about tracking changes in an integrated library
Stephen J. Turnbull
stephen at xemacs.org
Wed Sep 22 03:35:52 BST 2010
Chris Hecker writes:
> - lib-1.0 is the released version on the internets.
> - my friend has modified lib-1.0, so it's kind of a branch of
> lib-1.0
It *is* a branch, by definition, once you have both in the VCS. Maybe
you mean "fork" (which is a "socially-defined" term, meaning a branch
of indefinite expected lifespan maintained outside of the parent
project)?
> - when lib-1.1 comes out, I'd like to be able to get those fixes
So far, no technical problem.
> - I want lib to be checked into my main source tree, not some other
> directory I have to sync
I assume that by "integrated" you mean this kind of structure:
top -+- doc
     |
     +- src
     |
     +- lib   <== the code maintained outside your project
This is going to be a dance in bzr, but closer to "Flashdance" than a
waltz. (What you need is "nested branches", and bzr doesn't do that
yet, although it's a frequently requested feature. See the "Roadmap
for Bazaar..." thread.)
There are two basic approaches available at present: "poor man's
nested branches" and "monolithic branches". The poor man's nested
branch works like this:
1. Initialize a shared repository.[1]
2. Set up a "trunk" branch of your project containing everything but
the lib subtree. (How to do this will depend on how much history
of your project needs to be preserved.)
3. In that branch, "bzr ignore lib".
4. As a *sibling* of your branch, create a lib-upstream branch:
copy the lib-1.0 code there, bzr init, bzr add, bzr commit, bzr
tag.[2]
5. As a *sibling* of lib-upstream, branch from lib-upstream into
lib-modified, apply your friend's patches/copy the code into
lib-modified, bzr add/rm/mv as appropriate, bzr commit, bzr
tag.[2]
6. cd into "trunk", and "bzr branch ../lib-modified lib".
*Not* "bzr checkout ...": using an independent branch means you
can hack in ./lib without worrying about "polluting" your
upstreams.
This is probably more intuitive, but the relationships among the
branches are non-trivial, and depending on your work habits, it may be
easy to forget to commit your local changes in trunk/lib. ("cd trunk;
bzr status" won't remind you about them, for example.) If you're
never going to make such changes, then it's no problem. :-)
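For concreteness, steps 1-6 might look roughly like the following. This is only a sketch under assumptions: the function name, the "run" helper, the tag names, the commit messages, and all paths are placeholders I made up, and it assumes that lib-1.0 and your friend's modified version are already unpacked somewhere and that you pass absolute paths. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch of the "poor man's nested branches" setup (steps 1-6).
# Names, tags, and paths are placeholders, not anything bzr requires.

# Print each command by default; set DRY_RUN=0 to really run it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

setup_poor_mans_nested() {
    repo=$1       # absolute path for the new shared repository
    upstream=$2   # absolute path of the unpacked lib-1.0 tree
    modified=$3   # absolute path of your friend's modified lib-1.0 tree

    run bzr init-repo "$repo"                    # step 1
    run cd "$repo" || return 1

    run bzr init trunk                           # step 2 (then add your project files)
    run cd trunk || return 1
    run bzr ignore lib                           # step 3
    run bzr commit -m "Ignore the nested lib branch"
    run cd .. || return 1

    run bzr init lib-upstream                    # step 4
    run cp -R "$upstream/." lib-upstream/
    run cd lib-upstream || return 1
    run bzr add
    run bzr commit -m "Import lib 1.0"
    run bzr tag lib-1.0
    run cd .. || return 1

    run bzr branch lib-upstream lib-modified     # step 5
    run cp -R "$modified/." lib-modified/
    run cd lib-modified || return 1
    run bzr add                                  # plus "bzr rm"/"bzr mv" by hand as needed
    run bzr commit -m "Import friend's modifications to lib 1.0"
    run bzr tag lib-1.0-modified
    run cd .. || return 1

    run cd trunk || return 1                     # step 6
    run bzr branch ../lib-modified lib
}
```

Calling "setup_poor_mans_nested /path/to/repo /path/to/lib-1.0 /path/to/lib-1.0-modified" prints the whole command sequence, so you can review it before running anything for real.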
Ongoing workflows:
a. Normal development: work in trunk as usual: hack, commit, release.
For experimental work create feature branches, for release
maintenance create maintenance branches.
b. Friend releases bugfixes etc: apply to "lib-modified" branch, then
merge to the "lib" branch in trunk, then merge to any descendants
of trunk. *Your friend must not be including any upgrades to
upstream in these patches.*[3]
c. Upstream releases new versions: apply to "lib-upstream" branch,
then merge down the cascade via "lib-modified", "trunk", and any
feature or maintenance branches.
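Workflows (b) and (c) for the "poor man's" layout could be scripted along these lines. Again a sketch, not a recipe: the function names, commit messages, and the way a new release is copied in are my own assumptions, and it assumes lib-upstream, lib-modified, and trunk are siblings in one shared repository with trunk/lib branched from lib-modified. Commands are only printed unless DRY_RUN=0.

```shell
#!/bin/sh
# Sketch of workflows (b) and (c) for the "poor man's" layout.
# Function names, messages, and tags are made up for illustration.

# Print each command by default; set DRY_RUN=0 to really run it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Merge the branch at $1 into the branch in directory $2 and commit.
merge_into() {
    from=$1; to=$2
    run cd "$to" || return 1
    run bzr merge "$from" || return 1   # stop here if the merge needs manual work
    run bzr commit -m "Merge from $from"
}

# (b) Your friend's fixes are already committed in lib-modified.
friend_fixes_cascade() {
    repo=$1                             # absolute path of the shared repository
    merge_into "$repo/lib-modified" "$repo/trunk/lib"
}

# (c) A new upstream release is unpacked in $2; $3 is a tag such as lib-1.1.
upstream_release_cascade() {
    repo=$1; new_release=$2; tag=$3
    run cp -R "$new_release/." "$repo/lib-upstream/"
    run cd "$repo/lib-upstream" || return 1
    run bzr add                         # files dropped upstream still need "bzr rm"
    run bzr commit -m "Import $tag"
    run bzr tag "$tag"
    merge_into "$repo/lib-upstream" "$repo/lib-modified" || return 1
    merge_into "$repo/lib-modified" "$repo/trunk/lib"
}
```

The point of the helper is that each merge goes exactly one step down the cascade, so every branch keeps recording only its own layer of changes.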
It is possible to do all the work in the same workspace.[4]
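A single-workspace setup along the lines of footnote [4] might look like this. It is a sketch under assumptions: the function names and the workspace path are invented, and it assumes the sibling branches already exist in the shared repository. Commands are only printed unless DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: one working tree that is switched between sibling branches.
# Function names and the workspace location are made up for illustration.

# Print each command by default; set DRY_RUN=0 to really run it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Create a lightweight checkout of lib-upstream at $2.
# A lightweight checkout holds no history of its own.
make_workspace() {
    repo=$1; work=$2
    run bzr checkout --lightweight "$repo/lib-upstream" "$work"
}

# Re-point the existing workspace at another branch in the repository.
switch_workspace() {
    repo=$1; work=$2; branch=$3
    run cd "$work" || return 1
    run bzr switch "$repo/$branch"
}
```

For example, "make_workspace /path/to/repo /path/to/work" followed by "switch_workspace /path/to/repo /path/to/work lib-modified" would leave the one working tree pointing at lib-modified.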
For the "monolithic" approach, you need two branches, and maybe
three, I think: one used only for syncing to upstream "lib", one used
for syncing to your friend's version, and one for your own work. If
you're starting from scratch (no history in your project), then I
would
1. Initialize a shared repository.[1]
2. Initialize a branch "lib-upstream" in the shared repository.
3. Untar lib-1.0 in lib-upstream.
4. Rename the resulting subdirectory as desired.
5. Add all the files to bzr.
6. Commit.
7. Tag the commit.[2]
8. Branch "lib-modified" from "lib-upstream" in the shared repository.
9. rm -rf the upstream version of the lib directory. (I'm assuming a
tarball for your friend's version. If not, modify steps 9 and 10
appropriately.)
10. Untar lib-1.0-modified.
11. Rename to the name used in 4.
12. "bzr add" any new files, "bzr rm" any deleted files, "bzr mv" any
renamed files.
13. Commit.
14. Tag the commit.[2]
15. Branch "lib-modified" to "trunk" (your main development branch).
16. Add the files for your project to trunk.
17. "bzr add" them, and commit.
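Steps 1-15 of the monolithic setup can be sketched as follows. The assumptions are mine, not anything bzr dictates: gzipped tarballs given by absolute path that unpack to "lib-1.0" and "lib-1.0-modified", a final directory name of "lib", and invented tag names and commit messages. Commands are only printed unless DRY_RUN=0.

```shell
#!/bin/sh
# Sketch of the "monolithic" setup, steps 1-15.
# Tarball layout, directory names, tags, and messages are assumptions.

# Print each command by default; set DRY_RUN=0 to really run it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

setup_monolithic() {
    repo=$1           # absolute path for the new shared repository
    upstream_tar=$2   # absolute path, assumed to unpack to ./lib-1.0
    modified_tar=$3   # absolute path, assumed to unpack to ./lib-1.0-modified

    run bzr init-repo "$repo"                    # step 1
    run cd "$repo" || return 1
    run bzr init lib-upstream                    # step 2
    run cd lib-upstream || return 1
    run tar -xzf "$upstream_tar"                 # step 3
    run mv lib-1.0 lib                           # step 4
    run bzr add                                  # step 5
    run bzr commit -m "Import lib 1.0"           # step 6
    run bzr tag lib-1.0                          # step 7
    run cd .. || return 1

    run bzr branch lib-upstream lib-modified     # step 8
    run cd lib-modified || return 1
    run rm -rf lib                               # step 9
    run tar -xzf "$modified_tar"                 # step 10
    run mv lib-1.0-modified lib                  # step 11
    run bzr add                                  # step 12: "bzr rm"/"bzr mv" by hand as needed
    run bzr commit -m "Import friend's modifications to lib 1.0"   # step 13
    run bzr tag lib-1.0-modified                 # step 14
    run cd .. || return 1

    run bzr branch lib-modified trunk            # step 15
    # Steps 16-17 are manual: copy your project's files into trunk,
    # then "bzr add" and "bzr commit" there.
}
```

Steps 16 and 17 are left out of the function on purpose, since what "your project's files" means is entirely up to you.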
Ongoing workflows:
a, b, and c as above, except that when merging to trunk, you merge
directly to trunk, not to the lib subtree.
Rationale:
For best results in getting help from your friend and/or the "lib"
upstream project, it's essential to know what changes have been made.
Having a separate branch for each version puts this information at
your fingertips, at a slight cost in overhead for each merge from
upstream.
Plausible variants:
1. If you already have history in your project, you need to import
that to bzr in the shared repository after step 1. I suggest
renaming that branch to "temporary". Then instead of initializing
"lib-upstream", branch it from "temporary" at step 2. Then remove
the contents of the lib directory (keeping any local patches, ie,
those not in your friend's modified version, somewhere safe). Now
proceed as in 3--15, and instead of 16 as above, restore your
local patches in trunk. Finally, remove the temporary branch.
(Not relevant to "poor man's nested branches.")
2. Proceed as in variant 1, then do development in "lib-upstream",
and merge to "lib-modified" and "trunk" for actual use. This is
the best of all worlds from the point of view of communicating
with upstream; you can test and demonstrate bugs with a pristine
version. This only works if your friend's modifications preserve
enough of lib's API that you can actually run such tests.
(Not relevant to "poor man's nested branches.")
Disadvantage: you have to do the "cascade merge" after every
commit to test the version you are producing.
3. Use "lib-modified" as "trunk". Very plausible if you make no or
"almost" no changes to lib-modified.
Disadvantage: almost none, since if you decide to make a
"significant" change to lib-modified, you can always branch at
that point. You can also branch "retroactively" if you decide
several commits later. (This requires a rebase, and if you run
into that situation you probably want this list's advice at the
time. I'm just noting the possibility here so that you don't
worry -- it's not going to be a big deal if you need to do it.)
I just don't like this variant because I'm anal-retentive about
recording history in separate branches. :-) I also wonder if it's
a good idea in "poor man's nested branches", but haven't thought
it out.
4. There are also "rebasing" workflows that may make sense if changes
to lib-upstream or lib-modified become frequent. bzr is not very
good at rebasing, though (where in this assessment I include a
scarcity of documentation on these workflows, and a general anti-
rebasing philosophy prevailing in the Bazaar community).
Advantage: better separation of your changes from upstream.
5. There are also "loom"-based workflows that may make sense if
changes to lib-upstream or lib-modified become frequent. Looms
are not very well documented, at least at the tutorial level, but
I know at least two heavy users of looms (Robert Collins, who
maintains the feature in Bazaar, and the FLUFL, who if you know
what that means you probably know him well enough to ask ;-) who
might be willing to help. Looms are much more in keeping with
Bazaar philosophy than rebase, but the educational resources are
much poorer. Ie, there are lots of good texts on rebase on the
Web. OTOH, Stacked Git is somewhat like looms, but really looms
are a unique feature of Bazaar, so there's not that much wisdom
you can borrow from other communities.
Advantages: better separation of your changes from upstream, and I
believe much more automatic than the main workflows described
above.
6. An alternative to "looms" is "pipelines". As the names suggest,
pipelines are more linear than looms, and I'm not sure they would
apply to your use case. Aaron Bentley maintains them, IIRC. Like
looms, they are minimally documented.
Advantages: better separation of your changes from upstream, and I
believe much more automatic than the main workflows described
above.
Footnotes:
[1] "bzr help init-repo" for more information.
[2] Not essential, but very useful for error recovery if you put
stuff in the branch that shouldn't be there.
[3] The problem is that bzr cannot know that the upstream changes
included in your friend's patches are the same ones that you are
bringing in when you update the lib-upstream branch. This is likely
to result in messy merge conflicts.
[4] "bzr help checkout", especially the --lightweight option, and
"bzr help switch", for more information. Also Google the wiki for
"colocated branches" (the archives will have tons on this, but mostly
in the form of feature requests and bzr-vs-git discussions; not very
useful in practical application).