Fixing rebase rather than avoiding it

Fri Mar 5 04:08:01 GMT 2010

Óscar Fuentes writes:
 > "Stephen J. Turnbull" <stephen at xemacs.org> writes:
 > 
 > >  > How could I teach my script to detect such cases and know that the
 > >  > revision that breaks the build is not, precisely, the one that
 > >  > introduced the bug I'm looking for?
 > >
 > > It's for Mercurial, but maybe this bisect that I recently did will
 > > help explain.  The important point is the "if make" in test.sh, which
 > > uses the return code from make to detect a broken build.
 > 
 > [snip]
 > 
 > > Note that "skip" means that "make" failed, "bad" means that the bug
 > > was found.
 > 
 > Okay, broken builds are easy to test. Let's go for a more realistic
 > scenario: a user files a bug report saying that the last update of
 > my compiler crashes with certain code. He adds a nice test case. So
 > I put bisect into action looking for the first revision that
 > crashes with that test case. But it is not so easy, as there are
 > revisions which are middle points in merged feature branches and
 > crashes on that test case too, although they are not the source of
 > the bug.  Actually, some of those revisions may crash with any
 > input.  So bisect is rendered unusable here, unless every revision
 > that ends merged into the master branch have passed a strict QA
 > process.

Granted, in those conditions it's a little harder, but still not
"unusable".  If you actually need to do this *with git* I'd be happy
to discuss elsewhere, but I don't think it's really relevant to this
discussion.
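
Just to indicate the shape of it: the usual trick is to have the test
script tell "this revision is broken in general" apart from "this
revision has the reported bug".  What follows is only a minimal sketch
for a POSIX shell.  classify_revision and the three commands passed to
it are hypothetical names of mine.  Exit status 125 is the value that
both git bisect run and hg bisect --command treat as "skip".

```sh
# classify_revision BUILD_CMD SMOKE_CMD TESTCASE_CMD
# Hypothetical helper: returns 0 (good), 1 (bad) or 125 (skip).
classify_revision() {
    build=$1 smoke=$2 testcase=$3
    # A failed build says nothing about the bug: skip.
    sh -c "$build" || return 125
    # Failing on an input that healthy revisions handle means the
    # revision is broken in general: skip it as well.
    sh -c "$smoke" || return 125
    # Only now does failing on the reported test case count as "bad".
    sh -c "$testcase" || return 1
    return 0
}
```

A test.sh handed to the bisect command would end with something like
classify_revision "make" "./cc smoke.c" "./cc report.c"; exit $?
where ./cc, smoke.c and report.c stand in for your compiler, an input
known to work on healthy revisions, and the user's test case.  This
does not catch a revision that passes the smoke input yet crashes on
the test case for unrelated reasons.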

 > Bazaar allows to set a policy that dictates "anything that ends on the
 > left part of the DAG must pass the strict QA process. the rest is not
 > required to follow this policy."

So does any other VCS, though.  bzr provides some additional support
for enforcing that policy AIUI, which may be a convenience for
projects with lots of developer turnover.  However, in both my
Mercurial and my git projects it is quite obvious which ancestry is
the left-hand one, and git log --merges is a close equivalent to
bzr log -n1 in practice (I rarely have nested merges).  So purely as a
matter of personal habit I end up with left-bisectable DAGs.  I suspect
you will find that most projects have left-bisectable DAGs.  (Emacs
seems likely to be an exception; OTOH, the relative care that Emacs
developers take with commits means that a full bisect should work
there.)

E.g., in git, running bisect only on the left-hand ancestry in general
might best be done by creating a temporary branch with git
filter-branch.  (This operation is generic and easily automated.)
Alternatively, for the minimal nesting case I experience, just

    git bisect skip `git rev-list --no-merges good..bad`

gives an excellent approximation.
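
When merges do nest, a somewhat more exact variant is to skip
everything that is not on the first-parent chain.  The following is a
sketch only, assuming a bisect is already in progress and that your
git's rev-list supports --first-parent.  skip_side_commits is a
hypothetical helper name, not a git command.

```sh
# skip_side_commits GOOD BAD
# Mark every commit in GOOD..BAD that is off the left-hand
# (first-parent) ancestry of BAD as "skip".
skip_side_commits() {
    good=$1 bad=$2
    all=$(git rev-list "$good..$bad") || return 1
    mainline=$(git rev-list --first-parent "$good..$bad") || return 1
    # Drop the mainline ids from the full list; what is left came in
    # through the second (or later) parent of some merge.
    side=$(printf '%s\n' "$all" | grep -vxF -e "$mainline")
    # A bare "git bisect skip" would skip the current commit, so do
    # nothing when there are no side commits.
    [ -z "$side" ] && return 0
    git bisect skip $side
}
```

It would be called once, right after git bisect start, e.g.
skip_side_commits good bad with your own two revisions.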

 > > However, it shows that bisect works fine in the presence of such
 > > breakage.  (In your case, running a similarly bad set of changes would
 > > take a whole day, which would be painful.  But it would work,
 > > eventually.)
 > 
 > I didn't realize that until now. If `bisect' is unlucky, it could end
 > trying a long series of broken revisions.

If it's just a localized run of broken revisions, this is easy to fix
with an exponential backoff (if the revs at +/- 1 are both skipped,
try +/- 4 revs, then +/- 16 revs, etc).  I don't know if any bisection
algorithm in use implements it, though.
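
The backoff could be sketched roughly like this.  It is an
illustration of the idea only, not what any existing bisect command
does.  Revisions are assumed to be numbered linearly, LO is the known
good one, HI the known bad one, and pick_candidate is a hypothetical
name.

```sh
# pick_candidate MID LO HI [SKIPPED...]
# Print the first revision number strictly between LO and HI that is
# not in the SKIPPED list, trying MID, then MID +/- 1, +/- 4, +/- 16...
# Returns 1 if the backoff runs out of range without finding one.
pick_candidate() {
    mid=$1 lo=$2 hi=$3 step=1
    shift 3
    skipped=" $* "
    # The midpoint itself is the first choice.
    case "$skipped" in *" $mid "*) ;; *) echo "$mid"; return 0 ;; esac
    while [ $((mid - step)) -gt "$lo" ] || [ $((mid + step)) -lt "$hi" ]; do
        for c in $((mid - step)) $((mid + step)); do
            if [ "$c" -gt "$lo" ] && [ "$c" -lt "$hi" ]; then
                case "$skipped" in *" $c "*) ;; *) echo "$c"; return 0 ;; esac
            fi
        done
        # Jump four times as far next time to get clear of the bad run.
        step=$((step * 4))
    done
    return 1
}
```

Note that the jumps can step over testable revisions, so a failure
here means "fall back to a linear scan", not "nothing left to test".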

 > When that can be expected, a traditional bug hunting session starts
 > looking attractive.

"Expected" means there are lots of bad runs, and, sure, the
traditional methods do look attractive.  But at that point the project
manager needs to call a meeting and say "we're spending a lot of
resources on bug hunting.  How much?  In our project, what would it
cost to make our tree bisectable for future revisions?  How about the
past?"  With those numbers in hand, he can decide what to do.

A similar process would be appropriate for ensuring
"left-bisectability".

Note that the end answer might be "too costly, let's stick with our
current process."  But that would be pretty much true with Bazaar,
too.  The only thing you really save is the managerial resources
involved in supervising the left-bisectable process; if changing
developer workflows is the blocker, that would be true in Bazaar as
well.  Bazaar doesn't turn left-ancestry-muddling workflows into
left-ancestry-preserving ones; it just makes it possible to reject
input with muddled left ancestry.  That's still going to leave that
developer frustrated when he can't commit.