looms v. rebase (or, Where are the blogs?!) [was: Re: Will re-basing support be added into Bazaar core ?]

Wed Apr 22 10:26:53 BST 2009

Andrew Bennetts writes:

 > If your new tip does have references to those original revisions,
 > but those references are part of something other than the ancestry,
 > then it seems to me like you have a more complicated model.

No.  All models that capture the situation are complex.  If you think
one is simpler than the others, then you simply have a limited
repertoire of operations that you want to do, and that model happens
to support them.  There's nothing wrong with users wanting simplicity,
but Bazaar claims to be flexible, and it "should" be able to make my
life as simple as git does, not just make your life simple.

FSVO "should", YMMV.

 > That might be reasonable, but it's far from clear to me that the
 > extra complexity is obviously superior.

Who said it's "superior"?  All I'm arguing is that rebase is a useful
optimization and not hazardous, let alone evil.  (I know, you never
said "evil".  But others did.)  And with git-style traversal
operators, the complexity is as manageable as you want it to be.

 > Or to put it another way: if a revision has two parents (the point
 > in trunk you are based off, and the original revision before
 > rebasing), why not record that as a merge revision?

In a long-lived branch that makes it hard to see what's going on.  You
have to subtract out all the merges when diffing.  Sure, as David
Strauss argues, Bazaar could do that for you, but isn't Bazaar slow
enough for you, yet?  I'm perfectly happy with Bazaar's slowness,
thank you, and don't care if nobody ever applies a slowdown patch to
it again!

OTOH, rebase optimizes this very common operation.

 > I do not appreciate your strawman here.  I never said that anyone was
 > recommending rebase as a tool for all seasons.

I'm not going to bother to determine whether you ever did before,
because you just did, as quoted immediately above.  "It's far from
clear to me that the extra complexity is obviously superior."

 > I have not used the word "evil".

Others have, and much of what I have written is in response to them.
I apologize for not keeping your messages separate from the rest of
the thread, but I'm not paid for my efforts to disentangle the current
hairball.  I'd be happy to accept a small honorarium in return for
ensuring more accurate citations, of course.

 > If you want to debunk those claims please direct your mail to
 > the inbox of someone that has actually made those claims rather
 > than put words in my mouth.

Er, I did.  bazaar at lists.canonical.com.  As for what did or didn't
come out of your mouth, if you reread what I wrote, I think you will
discover that I didn't claim that *you* wrote any such things, merely
that I was responding to them.

 > What I do claim is that rebase alters the history of a branch.  In
 > bzr (the topic of this mailing list) this is clearly true,

I used to think that's true, and that it's a bug in Bazaar design,
which I advocate fixing.  Bazaar should not have *any* operations that
lose history implicitly.  It should be trivial to recover *all*
"dropped" history until GC, and GC shouldn't touch anything younger
than the user's most recent term of employment or so.<wink>

Except that it's not clearly true, as both Robert and I have
demonstrated.  It *is* possible to maintain multiple heads in a Bazaar
branch, just very tedious, so operations similar to rebase (preserving
history) are apparently possible.

 > as there is no way to make a revision part of a branch except via
 > the branch's ancestry.  So performing an operation on the branch
 > that results in a new ancestry that does not contain that revision
 > is removing history from that branch.  You could add a revision
 > property to name a revision not in the ancestry, but that won't
 > automatically lead to those revisions being copied by push, pull,
 > send or merge, so they may as well not be there, because other
 > people will never receive those revisions.

This is all a detail of the current implementation, which could be
changed if that change were an improvement.  Isn't improving bzr one
of the purposes of this list?

 > This is not a problem with a loom.

But the only documentation I could find for looms is bzr help loom,
which is entirely lacking in theory of operation, and

bazaar-vcs.org/Documentation/LoomAsSmarterQuilt

It's hard to discuss something documented like that.  Rebase, on the
other hand, is a tried and true tool that's well-documented in many
tutorials and FAQs.

 > Even in git, with its implementation of rebase and the "flexible
 > history-traversing" you describe, it still sounds like the original
 > revisions are second-class history.

Yes.  That is the *intent*, to deprecate the old history in favor of
the history as rebased (and presumably retested, because Our Hero is a
Responsible Programmer).  The rebased history is *better* for the most
frequently occurring purposes in Our Hero's opinion.

 > They aren't transferred yet, although you bet they will be soon, so
 > those commits can easily be lost (perhaps you'd object less to
 > "history-losing" than "history-destroying").

"Soon" is under the control of the user.  By default it's 60 days.
Note that if you have a reasonable backup regime, you probably have 10
copies of that history.  And those are true, verifiable copies by
construction.  If you can name it, you can verify it.  "Lost"?  I
suppose so.

 > You say the rebase proof-of-concept had a flaw that would make
 > those commits garbage, the loom proof-of-concept has no such flaw,
 > because the design is different.

I misspoke.  *git* had a design flaw.  Now by default *all* updates to
a ref (including commits but also forced branch renames, deletions,
and moves) are reflogged.  It's just that rebase made this flaw of not
keeping the names of maybe-garbage around for a while real obvious.

Loom has its own flaws, however, as pointed out in
bazaar-vcs.org/Documentation/LoomAsSmarterQuilt:

    However, it is disastrous to perform a partial commit in
    feature-foo and then going up-thread, as the remaining changes are
    suddenly combined with any pending merges resulting from moving
    up-thread. Thus, if a partial commit is performed, I first shelve
    any remaining changes before going up-thread[.]

    If I forget to shelve changes before moving up-thread with pending
    merges, the remaining uncommitted changes become intertwined with
    the pending merge, and can potentially be difficult to
    extricate. This can be a frustrating situation and is one of the
    primary warts of using loom as a quilt replacement.

 > Further, it seems to me that rebase users often use it precisely
 > because it alters history.
 > <http://www.kernel.org/pub/software/scm/git/docs/user-manual.html>
 > has a chapter titled "Rewriting history and maintaining patch
 > series" to pick one example.  If you run git rebase -i you're given
 > a file to edit with the warning that "If you remove a line here
 > THAT COMMIT WILL BE LOST." Perhaps you need to debunk git's manual
 > and builtin warnings!

git already has both the rebase operation and immutable history.  Not
to mention "the Zeitgeist".  I think it's rather Bazaar that can
benefit from my input, if it cares to.

You should also understand that sophisticated git users now all
understand that it's possible to recover from anything a user can do,
as long as they don't touch anything under .git/objects.  So those
warnings refer to the kinds of cuts and bruises you get playing
basketball with the big boys, not to getting run over by a bus.  The
shouting is all about reducing FAQs on the mailing list, not
preventing destruction of valuable data.

 > The sense in which history is not "altered" or "destroyed" by
 > rebase seems pretty irrelevant to how and why many (most?) people
 > use rebase.

Yes and no.  The *intent* is to deprecate the old and exalt the new.
Nevertheless, we're all human, and occasionally rejoice that not only
"what was lost is now found!" but that it only took four characters to
do it ("@{1}").
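The recovery story can be sketched with a toy model.  This is not git's
implementation, just an illustration under stated assumptions: an
append-only object store, a mutable ref, and a per-ref log of earlier
tips.  The class `Repo` and its methods `commit`, `update_ref`, and
`resolve` are hypothetical names invented for the sketch, and "rebase"
here simply replays change labels onto a new parent.

```python
import hashlib


class Repo:
    """Toy model: append-only commits, mutable refs, and a per-ref reflog."""

    def __init__(self):
        self.objects = {}  # commit id -> (parent id or None, change label)
        self.refs = {}     # ref name -> current tip
        self.reflog = {}   # ref name -> earlier tips, newest first

    def commit(self, parent, change):
        # The id depends on parent and change, so a replayed change gets a new id.
        cid = hashlib.sha1(f"{parent}:{change}".encode()).hexdigest()[:8]
        self.objects[cid] = (parent, change)  # objects are only ever added
        return cid

    def update_ref(self, name, cid):
        # Remember the old tip before moving the ref.
        if name in self.refs:
            self.reflog.setdefault(name, []).insert(0, self.refs[name])
        self.refs[name] = cid

    def resolve(self, name, n=0):
        """Rough analogue of name@{n}: 0 is the current tip, 1 the previous one."""
        return self.refs[name] if n == 0 else self.reflog[name][n - 1]


repo = Repo()
base = repo.commit(None, "base")
trunk = repo.commit(base, "trunk work")

# Two commits on a work branch that forked from base.
w1 = repo.commit(base, "feature part 1")
w2 = repo.commit(w1, "feature part 2")
repo.update_ref("work", w2)

# "Rebase": replay the same changes on top of trunk, then move the ref.
n1 = repo.commit(trunk, "feature part 1")
n2 = repo.commit(n1, "feature part 2")
repo.update_ref("work", n2)

old_tip = repo.resolve("work", 1)
print("current tip:", repo.resolve("work"), "previous tip:", old_tip)
```

In this model the ref moves but nothing leaves the store, so the
pre-rebase tip stays reachable by name until something like GC prunes
the log.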

 > >  > Loom does preserve history.  It strictly only adds to the DAG.
 > > 
 > > This is true of git as well.  "Branches" in the sense that you think
 > > of them---as a strong association among a name, a working tree, and a
 > > linear history of development---*don't exist* in git.  There are
 > 
 > I am aware of this.  I'm not sure if you do understand what bzr-loom does.

No, of course not.  My workflow involves branching on the order of
every fifteen minutes, and a commit on every save.  No Python-based
tool can keep up with that: simply executing the interpreter and
importing a couple of modules implies a non-negligible delay.  So it
would take a big change in workflow, substantial time, and effort to
try looms in practice.

I tried to find out what loom does from the documentation, but
everything I could find (except for a few vague testimonials from
fans) indicates that it looks a lot like Mercurial queues.  The
results of a more systematic check are below.

 > > ahistorical and can't even represent the DAG AFAIK.  git provides no
 > > association: there's no easy way to find the former referents of
 > > rebased or otherwise reused refs, and they don't appear in gitk.
 > 
 > Looms do not have this flaw.

Oh?  What's the graphical UI for looms equivalent to gitk?  Is there a
publicly available moderately complex loom and/or script to produce
one for me to look at?  jamesh (see below) draws a nice picture but
it's not a screenshot.  If "none", I guess you're talking about "looms
don't have the feature so the feature can't be buggy."<wink>

 > You seem to be arguing that they are theoretically equivalent, but
 > clearly they are not so in practice!

First off, the HOWTO

http://bazaar.launchpad.net/~bzr-loom-devs/bzr-loom/trunk/annotate/head:/HOWTO

makes the straightforward use of a loom look very much like "rebase
with 12 bzr commands instead of one git command".  With modern gits
that check for identical changes, you even get the effect described by
"When upstream have merged a patch" automatically.  (AIUI, anyway; my
workflows rarely encounter duplicate patches, so I can't really say
from experience.)

Second, there is hardly any practice for looms.  Yeah, I know, Barry
and you and Robert love them to death.  The result of that love?  One
trivial example ("here's how you can turn a 'shelve'-'unshelve' pair
into about 6 commands with a loom!") in PEP 374, and the not entirely
reassuring blog entry cited above.  Googling for "loom
site:bazaar-vcs.org" returns NINE entries, of which only that blog
entry seems pertinent.

Googling for "bzr loom" gives only three blogly results in addition to
the above blog and the HOWTO in the first 50 results:

http://glyphy.com/bzr-loom-2008-07-07
http://www.flamingspork.com/blog/2008/02/22/bzr-loom-a-bzr-plugin-with-quilt-like-functionality/
http://blogs.gnome.org/jamesh/2008/04/01/bzr-loom/

*None* of them give a useful example that couldn't be done just as
well with rebase.  Certainly, you can *imagine* from the descriptions
that looms are more flexible, but AYGNI?  This is a major missed PR
opportunity if looms are all that you imply they are.

 > >  > The only exception is when you remove a thread;
 > > 
 > > How does this map to "rebase"?  Or is it irrelevant to the loom
 > > vs. rebase debate, except in the sense of a feature possessed by loom
 > > that rebase can't even represent?
 > 
 > It is like dropping a patch from the patch series.  In rebase terms it would
 > be equivalent to e.g. "take revisions 1, 2, 4 and 5 (and forget 3)".  I
 > assume git's rebase -i allows this.

It does, and you can also use rebase --onto to slice out a sequence of
revisions.  Do it twice to get the same effect from the command line.
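As a rough illustration of "take revisions 1, 2, 4 and 5 (and forget
3)", here is a toy model of one slicing step.  It is a sketch, not
git's algorithm: changes are opaque labels, there is no conflict
handling, and the names `store`, `commit`, `chain`, and `rebase_onto`
are hypothetical.  The argument order of `rebase_onto` is only meant
to echo the shape of `git rebase --onto <newbase> <upstream>`.

```python
from itertools import count

_ids = count(1)
store = {}  # commit id -> (parent id or None, change label); append-only


def commit(parent, change):
    """Add a commit to the store and return its id."""
    cid = next(_ids)
    store[cid] = (parent, change)
    return cid


def chain(tip, stop=None):
    """Change labels from just after `stop` up to `tip`, oldest first."""
    changes = []
    while tip != stop:
        parent, change = store[tip]
        changes.append(change)
        tip = parent
    return changes[::-1]


def rebase_onto(new_base, upstream, tip):
    """Replay the commits after `upstream` up to `tip` on top of `new_base`."""
    for change in chain(tip, upstream):
        new_base = commit(new_base, change)
    return new_base


# Build a linear history: root, then revisions 1 through 5.
root = commit(None, "root")
revs = [root]
for label in ["1", "2", "3", "4", "5"]:
    revs.append(commit(revs[-1], label))
old_tip = revs[5]

# Slice out revision 3: replay everything after 3 on top of 2.
new_tip = rebase_onto(new_base=revs[2], upstream=revs[3], tip=old_tip)

print("new history:", chain(new_tip))
print("old history:", chain(old_tip))
```

The replayed commits get fresh ids, and the original revisions 3, 4
and 5 are still in the store; whether anything still names them is a
separate question.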

So, back to the loom.  Where does the thread/patch go when you do
that?  Is there a way to recover it if you change your mind later?

 > > I don't understand the point.
 > 
 > The following sentence states the point:
 > 
 > >  > If you have a change, you have the revision where that change
 > >  > originated.

That sentence still has no meaning except "I like looms", because you
haven't explained what "originated" means or why anybody would care.

 > > You do?  In general, "the origin" might be a patch against a different
 > > branch with a very ambiguous relation to any of the "official"
 > > branches in the project (this happens in XEmacs and Python development
 > > all the time).  I don't understand what you mean by "*the* revision",
 > > or how you "have" it.
 > 
 > To be utterly precise: yes obviously applying plain diffs and similar
 > actions will of course lose revision metadata.  This is independent of
 > rebase vs. loom though.

Hey, you're the one who brought up "revision where that change
originated".  Let me tell you one reason why one might care.  If you
come up with a security patch in Python 3 and by some very good luck
it applies and works and passes the test suite in Pythons 2.5-2.7,
you're golden.  OTOH, if the patch originates in Python 2.6, then a
forward port is not a foregone conclusion, because it may be
unpy3onic.

 > The difference I am highlighting here is that no loom operation will lose
 > that revision metadata, or disassociate it or downgrade it to some sort of
 > second-class reference that gitk won't notice.  It is a regular part of the
 > branch's ancestry, like any other merge.  This is different to rebase.

But what about *before* it's merged?  It just lives in the loom, and
none of your other tools really know about it, right?

I mentioned my workflow above.  I commit on *every* save, because "git
commit" is on my after-save-hook.  If I see a change that "doesn't
belong" to my current nominal task, I save-and-commit, branch to
cherry-###, make the change, save-and-commit, then checkout the work
branch.  When I reach a "furlongstone" or simply start to forget the
motivation for everything I've done, I merge the work branch onto the
"presentation" branch.  (All this is done by trivial Emacs macros, of
course.  The only thing that's at all complex is the after-save-hook,
and that not very.)  gitk allows me to view both the "big commit" on
the presentation branch and to browse the process by which I arrived
there (occasionally useful in review).  Finally I usually reparent the
current commit to depend only on the presentation branch and delete
references to the working branch.  Sometimes I may reorder and squash
commits (and of course give them meaningful logs) using rebase -i.
That may be done either as part of the premerge review, or as an
afterthought.

The cherries I reap at my leisure.  The state of the orchard is easily
visible via "git branch | grep cherry".

 > > To loom.  It can't refer to rebase, because in rebase there must be a
 > > commit or you can't refer to it via the VCS.
 > 
 > But loom does record it: it's a merge.

Of what to what?  And I thought merges weren't recorded in bzr, but
rather you need to do an explicit commit?

It sounds to me suspiciously like a very inconvenient way to
approximate git's DAG traversal.  I suppose with a different workflow
it would be convenient, but to the extent I can imagine it, such a
workflow would be just as well-supported by Mercurial queues or StGit.

This is a major PR problem, that there's no 

 > I think you can assume I've read Mark's messages.  <wink>

Yeah, well, after Mark finishes <wink><wink>ing he'll move on to
<nudge><nudge>.  I hope he doesn't break your ribs!<wink>