looms v. rebase (or, Where are the blogs?!) [was: Re: Will re-basing support be added into Bazaar core ?]

Wed Apr 22 13:22:43 BST 2009

Stephen J. Turnbull wrote:
> Andrew Bennetts writes:
[...]
>  > That might be reasonable, but it's far from clear to me that the
>  > extra complexity is obviously superior.
> 
> Who said it's "superior"?  All I'm arguing is that rebase is a useful
> optimization and not hazardous, let alone evil.  (I know, you never
> said "evil".  But others did.)  And with git-style traversal
> operators, the complexity is as manageable as you want it to be.

You did, implicitly, by your very vocal defence of rebase.  If you don't
think it's superior, then your efforts in this thread to rebut any possible
impugnment of rebase's merits made by unspecified people are even more
mysterious to me than they already were.

>  > Or to put it another way: if a revision has two parents (the point
>  > in trunk you are based off, and the original revision before
>  > rebasing), why not record that as a merge revision?
> 
> In a long-lived branch that makes it hard to see what's going on.  You
> have to subtract out all the merges when diffing.  Sure, as David
> Strauss argues, Bazaar could do that for you, but isn't Bazaar slow
> enough for you, yet?  I'm perfectly happy with Bazaar's slowness,
> thank you, and don't care if nobody ever applies a slowdown patch to
> it again!

You seem very confident that you know the causes of Bazaar's slowness.
Please see Robert's mail on this topic.

> OTOH, rebase optimizes this very common operation.

Among other effects.

>  > I do not appreciate your strawman here.  I never said that anyone was
>  > recommending rebase as a tool for all seasons.
> 
> I'm not going to bother to determine whether you ever did before,
> because you just did, as quoted immediately above.  "It's far from
> clear to me that the extra complexity is obviously superior."

I fail to see how expressing scepticism that rebase is a superior solution
in some situations can be understood as a claim that someone recommends it
as a tool for all seasons.

>  > I have not used the word "evil".
> 
> Others have, and much of what I have written is in response to them.
> I apologize for not keeping your messages separate from the rest of
> the thread, but I'm not paid for my efforts to disentangle the current
> hairball.  I'd be happy to accept a small honorarium in return for
> ensuring more accurate citations, of course.

You should apologise to yourself too.  You aren't making your thoughts any
more accessible by entangling them, or helping the odds that the people you
say you are responding to will read and engage with your response.

>  > If you want to debunk those claims please direct your mail to
>  > the inbox of someone that has actually made those claims rather
>  > than put words in my mouth.
> 
> Er, I did.  bazaar at lists.canonical.com.  As for what did or didn't
> come out of your mouth, if you reread what I wrote, I think you will
> discover that I didn't claim that *you* wrote any such things, merely
> that I was responding to them.

Er, no, you didn't.  My name and email appear in the To: header, and
bazaar@ only appeared in the Cc:.  Your message quotes a message I posted,
and is formatted as a paragraph-by-paragraph response to the contents of that
message.  If that's not directed to me then no email is.

As with the release-readiness thread, I am mystified about why you choose to
respond to my messages, directly addressing me, to argue against a position
I have not taken on an issue that is tangential to what I posted about.

>  > What I do claim is that rebase alters the history of a branch.  In
>  > bzr (the topic of this mailing list) this is clearly true,
> 
> I used to think that's true, and that it's a bug in Bazaar design,
> which I advocate fixing.  Bazaar should not have *any* operations that
> lose history implicitly.  It should be trivial to recover *all*
> "dropped" history until GC, and GC shouldn't touch anything younger
> than the user's most recent term of employment or so.<wink>
> 
> Except that it's not clearly true, as both Robert and I have
> demonstrated.  It *is* possible to maintain multiple heads in a Bazaar
> branch, just very tedious, so operations similar to rebase (preserving
> history) are apparently possible.

Yes, as far as I can tell the underlying behaviour of bzr and git here is
very similar, although the UIs differ.

And in both cases, the history of your
branch/line-of-development/flibble-wobblet has *changed*, and that change
has ramifications on fetching those things and viewing their contents.

>  > as there is no way to make a revision part of a branch except via
>  > the branch's ancestry.  So performing an operation on the branch
>  > that results in a new ancestry that does not contain that revision
>  > is removing history from that branch.  You could add a revision
>  > property to name a revision not in the ancestry, but that won't
>  > automatically lead to those revisions being copied by push, pull,
>  > send or merge, so they may as well not be there, because other
>  > people will never receive those revisions.
> 
> This is all a detail of the current implementation, which could be
> changed if that change were an improvement.  Isn't improving bzr one
> of the purposes of this list?

And every Bazaar download could include a free pony, too, if someone would
just make the change...  It's not enough to point at an already recognised
problem and say “you should change that”.  What sort of change do you think
we should make?

>  > This is not a problem with a loom.
> 
> But the only documentation I could find for looms is bzr help loom,
> which is entirely lacking in theory of operation, and
> 
> bazaar-vcs.org/Documentation/LoomAsSmarterQuilt

Yes, the existing Loom plugin has too many sharp edges, and the lack of good
documentation is part of that.  It's still more of a proof-of-concept than a
ready-for-all-users tool.  The relevant developers have been busy working on
other things, sadly.  It is useful even in its current state, but you do
need to overcome the poor documentation and have some tolerance for
unpolished UI.

> It's hard to discuss something documented like that.  Rebase, on the

That doesn't seem to have stopped you from making statements about its
capabilities vs. rebase!  So far you've given every impression of being
certain about your understanding of what looms can do, even when developers
dispute your statements.

> other hand, is a tried and true tool that's well-documented in many
> tutorials and FAQs.
> 
>  > Even in git, with its implementation of rebase and the "flexible
>  > history-traversing" you describe, it still sounds like the original
>  > revisions are second-class history.
> 
> Yes.  That is the *intent*, to deprecate the old history in favor of
> the history as rebased (and presumably retested, because Our Hero is a
> Responsible Programmer).  The rebased history is *better* for the most
> frequently occurring purposes in Our Hero's opinion.

And, in my opinion, the rebased history is clearly deficient in some
important aspects, and the tradeoff seems unnecessary to me for most
frequently occurring purposes.

>  > They aren't transferred yet, although you bet they will be soon, so
>  > those commits can easily be lost (perhaps you'd object less to
>  > "history-losing" than "history-destroying").
> 
> "Soon" is under the control of the user.  By default it's 60 days.
> Note that if you have a reasonable backup regime, you probably have 10
> copies of that history.  And those are true, verifiable copies by
> construction.  If you can name it, you can verify it.  "Lost"?  I
> suppose so.

“I have a backup somewhere so my VCS doesn't matter” is a pretty weak
argument for a VCS feature!

[...]
> Loom has its own flaws, however, as pointed out in

Lots of them, sadly.  It needs someone to get some spare tuits...

> bazaar-vcs.org/Documentation/LoomAsSmarterQuilt:
> 
>     However, it is disastrous to perform a partial commit in
>     feature-foo and then going up-thread, as the remaining changes are
>     suddenly combined with any pending merges resulting from moving
>     up-thread. Thus, if a partial commit is performed, I first shelve
>     any remaining changes before going up-thread[.]
> 
>     If I forget to shelve changes before moving up-thread with pending
>     merges, the remaining uncommitted changes become intertwined with
>     the pending merge, and can potentially be difficult to
>     extricate. This can be a frustrating situation and is one of the
>     primary warts of using loom as a quilt replacement.

This is true, although it's hardly a fundamental flaw with the design; it's
just something that needs some UI polish.  Do you have a time machine I can
borrow so I can get more work done?

>  > Further, it seems to me that rebase users often use it precisely
>  > because it alters history.
>  > <http://www.kernel.org/pub/software/scm/git/docs/user-manual.html>
>  > has a chapter titled "Rewriting history and maintaining patch
>  > series" to pick one example.  If you run git rebase -i you're given
>  > a file to edit with the warning that "If you remove a line here
>  > THAT COMMIT WILL BE LOST." Perhaps you need to debunk git's manual
>  > and builtin warnings!
> 
> git already has both the rebase operation and immutable history.  Not
> to mention "the Zeitgeist".  I think it's rather Bazaar that can
> benefit from my input, if it cares to.
> 
> You should also understand that sophisticated git users now all
> understand that it's possible to recover from anything a user can do,
> as long as they don't touch anything under .git/objects.  So those
> warnings refer to the kinds of cuts and bruises you get playing
> basketball with the big boys, not to getting run over by a bus.  The
> shouting is all about reducing FAQs on the mailing list, not
> preventing destruction of valuable data.

Oh, so long as *sophisticated* users don't get confused or encouraged to
misuse the tool, it's all ok?

I welcome your insights.  I'm just not particularly interested in arguments
that rebase doesn't do X or Y for some definition of X and Y that only you
seem to use (it's not the terminology as understood in this community, and
it's not the terminology as demonstrated in git's documentation).  That is
not helping any tool to improve.  You may as well argue that people that
break in to computers should not be called “hackers”.  Sorry, that battle's
been lost.

>  > The sense in which history is not "altered" or "destroyed" by
>  > rebase seems pretty irrelevant to how and why many (most?) people
>  > use rebase.
> 
> Yes and no.  The *intent* is to deprecate the old and exalt the new.
> Nevertheless, we're all human, and occasionally rejoice that not only
> "what was lost is now found!" but that it only took four characters to
> do it ("@{1}").

Yes, easy undo is always a welcome feature.  That's a separate issue, though.

>  > >  > Loom does preserve history.  It strictly only adds to the DAG.
>  > > 
>  > > This is true of git as well.  "Branches" in the sense that you think
>  > > of them---as a strong association among a name, a working tree, and a
>  > > linear history of development---*don't exist* in git.  There are
>  > 
>  > I am aware of this.  I'm not sure if you do understand what bzr-loom does.
> 
> No, of course not.  My workflow involves branching on the order of
> every fifteen minutes, and a commit on every save.  No Python-based
> tool can keep up with that, simply executing the interpreter and
> importing a couple of modules implies non-negligible delay.  So it
> would take a big change in workflow, substantial time, and effort to
> try looms in practice.
> 
> I tried to find out what loom does from the documentation, but
> everything I could find (except for a few vague testimonials from
> fans) indicates that it looks a lot like Mercurial queues.  The
> results of a more systematic check are below.

Mercurial queues are less capable because, like rebase, they discard the
original revisions as you make changes to the queue, e.g. when you
“refresh”.

>  > > ahistorical and can't even represent the DAG AFAIK.  git provides no
>  > > association: there's no easy way to find the former referents of
>  > > rebased or otherwise reused refs, and they don't appear in gitk.
>  > 
>  > Looms do not have this flaw.
> 
> Oh?  What's the graphical UI for looms equivalent to gitk?  Is there a

The same UI as for any bzr branch: bzr viz/qlog, depending on your
preference.  It would be neat if qlog of a loom automatically loaded all the
heads in the loom... maybe it already does?

> publicly available moderately complex loom and/or script to produce
> one for me to look at?  jamesh (see below) draws a nice picture but
> it's not a screenshot.  If "none", I guess you're talking about "looms
> don't have the feature so the feature can't be buggy."<wink>

I'm not sure that a screenshot would be particularly illuminating.

>  > You seem to be arguing that they are theoretically equivalent, but
>  > clearly they are not so in practice!
> 
> First off, the HOWTO
> 
> http://bazaar.launchpad.net/~bzr-loom-devs/bzr-loom/trunk/annotate/head:/HOWTO
> 
> makes the straightforward use of a loom look very much like "rebase
> with 12 bzr commands instead of one git command".  With modern gits
> that check for identical changes, you even get the effect described by
> "When upstream have merged a patch" automatically.  (AIUI, anyway, my
> workflows rarely encounter duplicate patches so I can't really say
> from experience.)

Let's not get too bothered by the current UI; as I say, the implementation is
still somewhat rough, and there are plenty of warts to be removed and
improvements to be made.

A loom is basically colocated branches + some commands for a particular
workflow + some support for versioning what the collection of heads is.

Say you are working on a feature for some project, and you want to track the
trunk while doing so.  Let's say it turns out to have 4 logical parts that
you would like to deliver as a series of changes, rather than as one
monolithic change.  You could do it by making four separate branches, and
when you decide you want to get the latest changes on trunk you can merge
trunk into part-1, then part-1 into part-2, etc.  Looms basically make that
workflow more convenient.  You still have multiple pieces that you can
deliver separately for convenience of review and merging by the recipient,
but no commit is ever discarded, ignored, downgraded, or whatever euphemism
you prefer.

On top of that, because the overall state is versioned, you can share and
collaborate on the loom, even in situations like "I've merged latest trunk
into parts 1-3, but I haven't merged into 4 yet".
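
To make the stacked-branches idea concrete, here is a toy Python model.  It
is not bzr or bzr-loom code: ToyRepo, cascade and the r1, r2, ... revision
ids are all made up for illustration, and a "merge" is reduced to recording a
revision with two parents.  It only sketches the property I care about, which
is that updating the stack adds revisions and never drops one.

```python
# Toy model only: not bzr or bzr-loom code.  All names are hypothetical.
import itertools


class ToyRepo:
    """Append-only revision graph: revid -> tuple of parent revids."""

    def __init__(self):
        self.parents = {}
        self._counter = itertools.count(1)

    def commit(self, *parent_ids):
        revid = "r%d" % next(self._counter)
        self.parents[revid] = tuple(parent_ids)
        return revid

    def ancestry(self, revid):
        """All revisions reachable from revid, including revid itself."""
        seen, todo = set(), [revid]
        while todo:
            rev = todo.pop()
            if rev not in seen:
                seen.add(rev)
                todo.extend(self.parents[rev])
        return seen


def cascade(repo, trunk_tip, thread_tips):
    """Merge trunk into part-1, part-1 into part-2, and so on.

    Returns the new thread tips.  Each old tip is a parent of its
    replacement, so it stays in the ancestry.
    """
    new_tips = []
    lower = trunk_tip
    for tip in thread_tips:
        if lower in repo.ancestry(tip):
            merged = tip                      # already up to date
        else:
            merged = repo.commit(tip, lower)  # a plain merge revision
        new_tips.append(merged)
        lower = merged
    return new_tips


repo = ToyRepo()
base = repo.commit()
tips = []
tip = base
for _ in range(4):           # parts 1-4, each stacked on the one below
    tip = repo.commit(tip)
    tips.append(tip)
trunk = repo.commit(base)    # meanwhile trunk moves on

# "I've merged latest trunk into parts 1-3, but I haven't merged into 4 yet."
partial = cascade(repo, trunk, tips[:3]) + tips[3:]
# Later, finish the job; parts 1-3 need no new merges.
full = cascade(repo, trunk, partial)

# The versioned "collection of heads": one entry per recorded loom state.
loom_states = [tips, partial, full]
```

Every original tip is still in the ancestry of the thread that replaced it,
and the half-finished state is just another entry in loom_states that could be
shared as-is.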

If I used git, I'd still want loom available to me.

> Second, there is hardly any practice for looms.  Yeah, I know, Barry
> and you and Robert love them to death.  The result of that love?  One
> trivial example ("here's how you can turn a 'shelve'-'unshelve' pair
[...]
> well with rebase.  Certainly, you can *imagine* from the descriptions
> that looms are more flexible, but AYGNI?  This is a major missed PR
> opportunity if looms are all that you imply they are.

Well, they aren't ready for prime-time.  I think the level of fuss about
them has been about right given the quality of the implementation so far.

>  > >  > The only exception is when you a remove a thread;
[...]
> So, back to the loom.  Where does the thread/patch go when you do
> that?  Is there a way to recover it if you change your mind later?

The same as if you rebase, uncommit, or whatever.  If you haven't
garbage-collected the repository (and given that we don't have a
garbage-collect command yet it's safe to assume you haven't!) then it'll
still be there.  You can retrieve it by the revid.  You can find the revid
with the “bzr heads” plugin, or by looking at your commit emails, or use a
tag if you tagged it, or use Robert's bzr-search plugin, etc.
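
As a rough illustration of why the revision is still findable, here is a toy
Python sketch.  It is an assumption-laden model, not a description of how the
“bzr heads” plugin is implemented: the repository is just a dict of
revid -> parents, and unreferenced_heads is a hypothetical helper.

```python
# Toy model only; the revids and the helper name are made up.
def unreferenced_heads(parents, known_tips):
    """Revisions that are nobody's parent and are not a known branch tip."""
    has_child = {p for ps in parents.values() for p in ps}
    return sorted(set(parents) - has_child - set(known_tips))


# The repository still stores everything that was ever committed.
parents = {
    "base": (),
    "a1": ("base",),
    "a2": ("a1",),      # tip of the thread that was removed
    "b1": ("base",),    # tip of the thread that is still in the loom
}

lost = unreferenced_heads(parents, known_tips=["b1"])
recovered_parents = parents[lost[0]]   # still retrievable by revid
```

Until something garbage-collects the store, removing the thread only removes
the name that pointed at a2, not a2 itself.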

>  > > I don't understand the point.
>  > 
>  > The following sentence states the point:
>  > 
>  > >  > If you have a change, you have the revision where that change
>  > >  > originated.
> 
> That sentence still has no meaning except "I like looms", because you
> haven't explained what "originated" means or why anybody would care.

It originates with the Big Bang.  Then a while later came the dinosaurs...

Perhaps it's easiest to show what I mean by contrasting with rebase.  In
rebase, if I have some diff A in revision R, then later when I rebase (to
update my work for changes in trunk, for instance) I'll have diff A'
(possibly identical to A) in revision R', and revision R is no longer in the
ancestry of this tip.  It's still the “same” change, but it has a new
identity.

In the loom workflow, I still have R in the ancestry even after updating
my work for changes in trunk.  The original commit, with its original message
and original date and signature and annotations, is still there.

A benefit of this is that if someone else had branched off R then there's no
unnecessary impediment to merging their work with trunk, which means that in
turn there's less incentive to keep my work-in-progress unpublished just
because it isn't quite ready for submission to trunk.  
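
The contrast can be sketched with a toy ancestry graph in Python.  This is
only an illustration under simplifying assumptions: the graph is a dict, the
revision names (R, R-prime, M, theirs, trunk-2) are invented, and neither
rebase nor merge is actually performed, only the parent pointers each would
leave behind.

```python
# Toy model only; not bzr or git code.  All names are hypothetical.
def ancestry(parents, tip):
    """All revisions reachable from tip, including tip itself."""
    seen, todo = set(), [tip]
    while todo:
        rev = todo.pop()
        if rev not in seen:
            seen.add(rev)
            todo.extend(parents[rev])
    return seen


parents = {
    "base": (),
    "trunk-2": ("base",),   # trunk has moved on
    "R": ("base",),         # my original revision carrying diff A
    "theirs": ("R",),       # someone else branched off R
}

# Rebase: diff A' is committed as a new revision on top of trunk-2.
parents["R-prime"] = ("trunk-2",)

# Loom-style update: trunk-2 is merged into my work, with R as a parent.
parents["M"] = ("R", "trunk-2")


def shared(tip_a, tip_b):
    """History two tips have in common."""
    return ancestry(parents, tip_a) & ancestry(parents, tip_b)


r_in_rebased = "R" in ancestry(parents, "R-prime")   # False
r_in_merged = "R" in ancestry(parents, "M")          # True
```

In this model the other person's branch shares only "base" with the rebased
tip, but shares both "R" and "base" with the merged tip, which is why merging
their work later is a non-event in the second case.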

>  > The difference I am highlighting here is that no loom operation will lose
>  > that revision metadata, or disassociate it or downgrade it to some sort of
>  > second-class reference that gitk won't notice.  It is a regular part of the
>  > branch's ancestry, like any other merge.  This is different to rebase.
> 
> But what about *before* it's merged?  It just lives in the loom, and
> none of your other tools really know about it, right?

Sure they do.  Looms are just another branch as far as other code is
concerned; so long as I have the plugin installed, a loom looks just like any
other branch to tools like “bzr viz”.  Tools that are using the regular
branch API will see the tip of the current thread as the tip of the
branch.  I can pass “thread:foo” as a revspec to any command that takes a
revspec.

[...]
>  > > To loom.  It can't refer to rebase, because in rebase there must be a
>  > > commit or you can't refer to it via the VCS.
>  > 
>  > But loom does record it: it's a merge.
> 
> Of what to what?  And I thought merges weren't recorded in bzr, but
> rather you need to do an explicit commit?

Yes.  When you do up-thread from thread A to thread B and there are unmerged
changes from the lower thread, you get a pending merge in the working tree
that you can then commit.  The parents of that commit are (tip(A), tip(B)).
i.e. the result in the history is identical to if A and B were separate,
non-loom branches, and you did “cd path/to/B; bzr merge ../path/to/A; bzr
commit”.
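
Here is a small Python sketch of that up-thread step.  It is a toy under
stated assumptions, not the bzr-loom implementation: up_thread and commit are
hypothetical helpers, parents are stored as unordered sets (so parent order is
deliberately not modelled), and the revision ids are invented.

```python
# Toy model only; not bzr-loom code.  All names are hypothetical.
def ancestry(parents, tip):
    """All revisions reachable from tip, including tip itself."""
    seen, todo = set(), [tip]
    while todo:
        rev = todo.pop()
        if rev not in seen:
            seen.add(rev)
            todo.extend(parents[rev])
    return seen


def up_thread(parents, lower_tip, upper_tip):
    """Return the pending merges left in the tree after moving up."""
    if lower_tip in ancestry(parents, upper_tip):
        return []             # upper thread already contains the lower one
    return [lower_tip]


def commit(parents, new_id, basis, pending_merges):
    """Record new_id with the tree's basis plus pending merges as parents."""
    parents[new_id] = frozenset([basis] + list(pending_merges))
    return new_id


parents = {
    "base": frozenset(),
    "A1": frozenset(["base"]),   # thread A before the new work
    "B1": frozenset(["A1"]),     # tip of thread B, stacked on A1
    "A2": frozenset(["A1"]),     # new commit in thread A
}

pending = up_thread(parents, "A2", "B1")          # unmerged change from A
merge_rev = commit(parents, "B2", "B1", pending)  # commit the pending merge
pending_again = up_thread(parents, "A2", "B2")    # nothing left to merge
```

The resulting revision has the two thread tips as its parents, which is what a
plain merge between two separate branches would have recorded.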

> It sounds to me to be suspiciously like a very inconvenient way to
> approximate git's DAG traversal.  I suppose with a different workflow
> it would be convenient, but to the extent I can imagine it such a
> workflow would be just as well-supported by Mercurial queues or StGit.

Unless you want other people to be able to branch off or merge from your
work, even if you later decide to update your work to a new trunk.  In that
situation rewriting branch history pulls the rug out from under them; simply
building on the original history makes that a non-event.

In another mail you say:

> While reading John's reply about previous head as a property and
> thinking how it sounds similar to the git practice of keeping reflogs,
> etc, it occurred to me that an important difference in patterns of
> thought here is that git users generally do not think of history as
> being "contained in branches," while you pretty clearly do.

I'm glad you finally noticed! ;)

-Andrew.