Workflows, rebase, patch theory

Wed May 7 15:17:50 BST 2008

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ben Finney wrote:
| Andrew Bennetts <andrew at canonical.com> writes:
|

...

|> -- and if other people have branched off you, you are headed for
|> massive merge conflicts: there are two branches with very similar
|> changes but no common revisions for those changes. Even Linus has
|> said he doesn't like rebasing: http://lwn.net/Articles/269120/
|
| Thanks. (The article discusses the topic well; Linus's specific
| message decrying "rebase" is at <URL:http://lwn.net/Articles/269210/>.)
|
| One thing I don't understand in Linus's complaint is:
|
|     "Now [in a hypothetical scenario] I've merged (say) 1500
|     networking-related commits by rebasing, but because I rebased on
|     top of Greg's tree that I had also rebased, absolutely *none* of
|     that has been tested in any shape of form."
|
| Isn't part of the point of "rebase" that the working tree ends up
| exactly the same as if a "merge" was done? Why, then, does Linus claim
| "none of [the changed code] has been tested"? Don't all existing tests
| continue to apply, if the working tree is the same?
|

A merge does not guarantee that the tests pass. For example:

   User A updates function foo() to require an argument: foo(x)
   User B writes a new function bar() that calls foo()

In this case, both branches can have 100% test coverage and have 100% of the
tests passing. However, when you merge the two, User B's bar() function will not
pass in the new parameter, so the merged tree is broken.

Note that there are no conflicts, and both sides had 100% perfect test coverage.
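
A minimal Python sketch of that scenario follows. Every name in it (base_foo,
branch_a_foo, branch_b_bar, passes) is hypothetical and exists only for
illustration; the "merge" is modelled simply by running User B's code against
User A's version of foo().

```python
def base_foo():
    # foo() as it exists at the common ancestor of both branches.
    return 1

def branch_a_foo(x):
    # User A: foo() now requires an argument.
    return x + 1

def branch_a_test(foo):
    # User A's test, updated for the new signature.
    assert foo(1) == 2

def branch_b_bar(foo):
    # User B: new function written against the zero-argument foo().
    return foo() * 2

def branch_b_test(foo):
    # User B's test for bar().
    assert branch_b_bar(foo) == 2

def passes(test, foo):
    # Run one test against one version of foo() and report the result.
    try:
        test(foo)
        return True
    except (TypeError, AssertionError):
        return False

# Each branch, tested on its own, is green.
branch_a_ok = passes(branch_a_test, branch_a_foo)
branch_b_ok = passes(branch_b_test, base_foo)

# After a textually clean merge, User B's code runs against User A's foo().
merged_ok = (passes(branch_a_test, branch_a_foo)
             and passes(branch_b_test, branch_a_foo))

print(branch_a_ok, branch_b_ok, merged_ok)
```

The two changes touch different lines, so a textual merge reports no conflict,
yet bar() fails with a TypeError once it meets the new foo().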

So yes, rebase is effectively taking each commit, doing a merge against tip,
throwing away the merge info, and committing as though it was done fresh.

It is *likely* that if the tests passed before, the tests will pass now. But it
only takes a fairly common operation (any sort of API change) to cause it to break.
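
As a rough illustration of that description, here is a toy model in Python. It
is not how any real tool implements rebase: a tree is just a dict of file name
to content, a commit is a dict of changes, and all names (apply_patch, base,
tip, patches) are made up for the example.

```python
def apply_patch(tree, patch):
    # Return a new tree with the patch's file changes applied.
    new_tree = dict(tree)
    new_tree.update(patch)
    return new_tree

base = {"foo.py": "v1"}
tip = apply_patch(base, {"net.py": "v1"})    # upstream moved on meanwhile
patches = [{"a.py": "v1"}, {"b.py": "v1"}]   # local commits made on top of base

# Original local history: these are the trees the author actually tested.
original = []
tree = base
for patch in patches:
    tree = apply_patch(tree, patch)
    original.append(tree)

# Rebase: replay each commit on top of tip and record it as a fresh commit.
rebased = []
tree = tip
for patch in patches:
    tree = apply_patch(tree, patch)
    rebased.append(tree)

# Merge: one new tree that combines tip with all of the local changes.
merged_tree = dict(tip)
for patch in patches:
    merged_tree.update(patch)

same_final_tree = rebased[-1] == merged_tree
reused_tested_trees = [t for t in rebased if t in original]

print(same_final_tree, len(reused_tested_trees))
```

In this model the final tree is identical either way, but none of the rebased
revisions matches a tree that was tested before. A merge adds one new tree on
top of the tested history; a rebase replaces every revision with a new one.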

I don't believe anyone uses 'rebase' and re-runs the tests all along the way. I
do believe some people use 'rebase' and then throw out some of the
intermediate commits, or reorder them, etc., to make a cleaner-looking history.

I believe there is a famous mathematician who would always rewrite his proofs
after he finished them, to make them very logical and straightforward, which
makes them easy to read. But nobody is able to learn *how* he solved these
problems, because all they see is the answer. (Coming from Engineering, "Show
your work" was about 90% of the grade.)

The way the Bazaar project handles intermediate commits is by distinguishing a
"mainline" for a branch, separate from the merged revisions. If you do "bzr log
--short bzr.dev", every single one of those commits (from 1554 on) has passed
the test suite.

If you want to bisect, you can follow the mainline revisions knowing that
everything should be working at each point. (At a minimum, you shouldn't be
getting syntax errors, etc.) Once you have tracked something down on mainline,
you can then bisect through the merge branch if you want better resolution, with
the caveat that you may run into a broken node.
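
The two-level search can be sketched in Python as below. This is only an
assumption-laden illustration, not a Bazaar command or API: the revision ids,
the merged_into mapping and the is_bad() check are all hypothetical, and the
search assumes history goes from good to bad exactly once.

```python
def first_bad(revisions, is_bad):
    # Binary search for the first bad revision in an ordered list.
    # Assumes the last revision is bad and that good never follows bad.
    lo, hi = 0, len(revisions) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid
        else:
            lo = mid + 1
    return revisions[lo]

# Hypothetical history: mainline revisions, plus the feature-branch
# revisions that one of those mainline merges brought in.
mainline = ["m1", "m2", "m3", "m4"]
merged_into = {"m3": ["f1", "f2", "f3"]}
bad = {"m3", "m4", "f2", "f3"}

def is_bad(rev):
    # Stand-in for "check out this revision and run the failing test".
    return rev in bad

# Step 1: bisect the mainline, where every revision passed the test suite.
bad_mainline = first_bad(mainline, is_bad)

# Step 2: bisect inside the merge that introduced the problem.
culprit = first_bad(merged_into[bad_mainline], is_bad)

print(bad_mainline, culprit)
```

Step 2 is where the caveat applies: a merged revision may be broken for
unrelated reasons, so is_bad() can give a misleading answer there.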

In my working branches, I will sometimes commit a broken revision, just to make
some forward progress. I don't feel the need to run the full test suite at every
commit, and often the minor patch will end up with a nice annotation about what
it is trying to do, etc.

Git as a project has at least stated the policy that everyone's work should be
created equal, and that a "mainline" is a bad thing. But all of *my* personal
work is not created equal. And I distinguish that by having a dev/feature
branch, versus my own mainline.

And certainly, there are de facto mainlines for the Linux kernel. (You don't run
random Joe Hacker's kernel, you run the stock kernel, or the -aa, or whatever.)

And, arguably, our 'bzr log --short' is a much cleaner view of history:

  3398 Canonical.com Patch Queue Manager 2008-05-01 [merge]
       Add the smart protocol v3 specification to network-protocol.txt

  3399 Canonical.com Patch Queue Manager 2008-05-01 [merge]
       Minor docstring cleanups (Ian Clatworthy)

  3400 Canonical.com Patch Queue Manager 2008-05-02 [merge]
       (robertc) Fix error reporting with bad revision parsing in weave
         repositories. (Robert Collins)

  3401 Canonical.com Patch Queue Manager 2008-05-02 [merge]
       (mbp) merge 1.4final back to trunk

  3402 Canonical.com Patch Queue Manager 2008-05-02 [merge]
       (mbp,trivial) fix stray comment

  3403 Canonical.com Patch Queue Manager 2008-05-02 [merge]
       (mbp) merge 1.3.1 news into trunk

  3404 Canonical.com Patch Queue Manager 2008-05-02 [merge]
       (mbp) deprecate LocableFiles.get_utf8

  3405 Canonical.com Patch Queue Manager 2008-05-03 [merge]
       (Jelmer) Deprecate Repository.revision_parents().

  3406 Canonical.com Patch Queue Manager 2008-05-05 [merge]
       (robertc) Preserve test ids correctly to aid debugging. (Robert
         Collins, Andrew Bennetts)

  3407 Canonical.com Patch Queue Manager 2008-05-06 [merge]
       (jam) Make Graph.find_differences() correct,
         and create a Graph.find_unique_ancestors function.

Single merge revisions that clearly identify what changed, giving you a quick
point to follow if something goes wrong. (I also always label my bug-fix merge
commits with the bug #, though that isn't strict Bazaar policy.)

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkghug4ACgkQJdeBCYSNAAOB6wCgxH6eJVf2P517KyFxz4os1gy8
yXYAniIqTMjNfMA0GOkFIRB85v1rqABl
=S8k4
-----END PGP SIGNATURE-----