[RFC] Cherry-picking, reordering revisions and 2D versioning

Sun Jan 15 15:19:09 GMT 2006

On Fri, 2006-01-13 at 11:13 +0100, Jan Hudec wrote:
> Well, it was a brain-storming. Yes, it can be simplified.

Actually, I just carefully re-read the whole thread and I think we are
basically talking about the same thing here.

The main distinction I see is that I do not want to introduce
reincarnation, just pick-merging. That is the best you can hope for
anyway since, in your proposal, cherry-picking involves a review step
where the user can include arbitrary changes. Those arbitrary changes
cannot be limited to conflict resolution and will normally contain new
work, for example fixing typos.

> All equivalent functionality of quilt/stgit/mq can be built without it.
> The logic that I want to sort out is what it takes to allow merging the
> stacks. Ie. when I pull a branch, use stack operations on revisions
> already there, branch from the middle of the new stack and then attempt
> to merge back to the original branch. And someone does similar thing and
> tries to merge back too. It might work with just picked-ancestry, but
> needs to be checked.

Oh my. Stgit does NOT support that?

It shows that I have not carefully checked the prior art before starting
to think about bzr cherrypicks. I just naively assumed that the
oh-so-great stgit would support the kind of operations you describe.

So, yes, my picked ancestry idea is very much designed to remove the
distinction between queues and branches and to support queue publication
and merging.

> > The common case that needs an extension of the current model is indeed
> > patch commutation, and it's just a special case of cherry-picking. It
> > would be sufficient to keep a list of "picked ancestors" that contains
> > only revision ids that are not part of the normal ancestry.
> 
> Well, it would need picked-ancestors for the moved down patch and
> equivalent-revisions for the moved up patch. Consider revision A and
> revision B based on it. You reorder it, so you get B2[picks=B] and
> A2[parent=B2, equals=B]. That is so that if you have D[parent=A2] and
> another branch has C[parent=B], you can merge from that branch, taking
> A2 as base.

If I understand you correctly you mean the following:

         P ---> A ---> B ----------> C
          \             \
           `---> B2 -----`-> A2 ---> D
                 [B]

        Lines denote ancestry. Names in bracket denote picked-ancestry.

"Taking A2 as the base" is doubly wrong.

      * It assumes 3-way merging, which quickly goes insane in
        real-world cherry-picking situations. Robust merging in the
        presence of cherry-picks requires weave merging.
      * Assuming 3-way merging, the right base is B, because A2 may
        contain additional changes applied during the review steps of B2
        and A2.
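
A minimal sketch of the second point, assuming a revision graph stored
as a plain dict of parent lists. The names PARENTS, PICKS, ancestry and
merge_bases are hypothetical and are not bzr API. The edges are read off
the diagram above, and picked ancestry is recorded but deliberately not
used when looking for a base:

```python
# Hypothetical model of the diagram above; not bzr's data structures.
PARENTS = {
    "P": [], "A": ["P"], "B": ["A"], "C": ["B"],
    "B2": ["P"], "A2": ["B2", "B"], "D": ["A2"],
}
PICKS = {"B2": ["B"]}  # picked ancestry, kept apart from normal ancestry


def ancestry(rev):
    """Return rev plus everything reachable through parent links."""
    seen, todo = set(), [rev]
    while todo:
        r = todo.pop()
        if r not in seen:
            seen.add(r)
            todo.extend(PARENTS[r])
    return seen


def merge_bases(x, y):
    """Common ancestors that are not ancestors of another common ancestor."""
    common = ancestry(x) & ancestry(y)
    return {c for c in common
            if not any(c != o and c in ancestry(o) for o in common)}


print(sorted(merge_bases("C", "D")))  # ['B']
print("A2" in ancestry("C"))          # False
```

A2 is not in the ancestry of C at all, so under this model it cannot
serve as a common base for C and D, while B can.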

> Thinking of it again, if A2[parent=B2,B], then this case would merge
> properly too. The question is whether it would suffice for other
> operations like revision splitting.

Yes, exactly what I mean. Revision splitting is annoying.

There is a worse-is-better way of handling splitting: don't. Consider
that B is split into B1 and B2.

 P ---> A ---> B
  \      
   `---> B1 ---> B2
         [B]

In that case, picking just B1 would be the same, ancestry-wise, as
picking B. That would be enough to address the "Quux bugfix" use case
you describe on PatchReordering, so I'm inclined to just go for it.
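
To make "the same, ancestry-wise" concrete, here is a rough sketch. It
assumes that each revision records its parents and its picks, and that
picks are followed transitively (a pick of B1 implies a pick of whatever
B1 itself picked). Both the record layout and the transitivity rule are
assumptions made for illustration, and the revision X is a made-up
example of someone picking only B1:

```python
# Hypothetical records: "parents" is normal ancestry, "picks" is
# picked ancestry. X is an invented revision that picks only B1.
REVS = {
    "P":  {"parents": [],     "picks": []},
    "A":  {"parents": ["P"],  "picks": []},
    "B":  {"parents": ["A"],  "picks": []},
    "B1": {"parents": ["P"],  "picks": ["B"]},
    "B2": {"parents": ["B1"], "picks": []},
    "X":  {"parents": ["P"],  "picks": ["B1"]},
}


def picked_ancestry(rev):
    """Revisions picked by rev or its ancestors, following picks of picks."""
    # Phase 1: normal ancestry through parent links.
    seen, todo = set(), [rev]
    while todo:
        r = todo.pop()
        if r not in seen:
            seen.add(r)
            todo.extend(REVS[r]["parents"])
    # Phase 2: picks recorded anywhere in that ancestry, closed over
    # the picks of the picked revisions (assumed transitive).
    picked, todo = set(), [p for r in seen for p in REVS[r]["picks"]]
    while todo:
        p = todo.pop()
        if p not in picked:
            picked.add(p)
            todo.extend(REVS[p]["picks"])
    # Only ids outside the normal ancestry count as picked ancestors.
    return picked - seen


print(sorted(picked_ancestry("X")))   # ['B', 'B1']
print(sorted(picked_ancestry("B2")))  # ['B']
```

Under these assumptions a branch that picks only B1 ends up with B in
its picked ancestry, which is all the "don't handle splitting" approach
needs.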

I do not have a solid proposal for accurate split support, just fuzzy
ideas at the moment. But splitting diffs is just such a PITA that I do
not expect it to be very important.

> Yes, that's clear and makes sense. And should probably work, though the
> details of weave-merge (3-way merge won't be able to deal with this)
> will still be a bit tricky.

I have been repeatedly assured by some of the few that really understand
weave merging (Martin and Aaron) that weaves support cherry-picking by
design. From what I can tell, weave-merge support for cherry-picks, is
kind of just a matter of actually coding it.

What picked-ancestry provides is the ability to do reasonably
meaningful diff3 merging, which I regard as very desirable since it's
more predictable and is generally more user friendly in case of
conflicts. Also, I believe referencing text ids from picked ancestry
would be useful to allow store garbage collection in the presence of
cherry-picked weaves.

> Yes, many revisions will be created this way. For common stbzr
> operation, where only the final stack should be published, they could be
> pruned. If you make a new version of already published stack, they would
> of course stay.

Basically two approaches there: severing ancestry, ghost revisions. But
as I said, it's premature optimization.

> > If you have real world use case that you do not think would be addressed
> > adequately by this model, please share it with me.
> 
> Well, they probably are addressed adequately. Though I am not sure how
> the partial cherry-pick (I only pick some of the changes, because the
> other parts fix code I didn't merge yet, or are screwed or whatever)
> will work out. Maybe it will, but I don't see it.

I hand-waved that the "Quux bugfix" use case is supported by my model.
You are calling my hand-waving. Okay, let's look at it.

At first, you are just coding on Foo, and you have a Quux bugfix.

        P ---> Foo1 ---> Quux

A Bar programmer wants the Quux patch without the Foo goo, so he does a
partial cherrypick.

        P ---> Foo1 ---> Quux
         \
          `--> Bar1 ---> Bar2
                        [Quux]

Note that Bar2 does not contain all the changes in Quux, as the Bar
programmer removed the bits that were associated changes to the Foo
code.

Then comes a qyzzy programmer, who has neither Foo1 nor Bar1 on his
branch, and who also needs the Quux bugfix. I think your use case is a
bit bogus because at no point did anybody prepare a clean Quux-fix
branch based on the mainline. Nevertheless, the qyzzy programmer now has
a clean Quux-fix patch to cherrypick.

        P ---> Foo1 ---> Quux
        |\
        | `--> Bar1 ---> Bar2
        |               [Quux]
         \
          `--> qyz1 ---> qyz2
                     [Quux, Bar2]

Now, we have not actually used the cherry-picked ancestry yet. So let's
try to find out where plain diff3 merging is going to fall short.

Consider that Foo2 gets merged into mainline first, then the Bar
programmer merges mainline. For the sake of readability I will not show
the mainline branch on the graph: merging a branch that merged Foo2 is
the same problem as merging Foo2 directly.

        P ---> Foo1 ---> Quux ---> Foo2
         \                           \
          `--> Bar1 ---> Bar2 --------`-> Bar3
                        [Quux]

Either with the current ancestry model or with picked-ancestry, the only
meaningful ancestor is going to be P. But for weave merge, we expect
that the Quux changes are already present in the weave in the ancestry
of Bar2 changes, so weave merge can avoid conflicts generated by diff3.
However, these conflicts could be meaningful, because they could be
caused by the removal of changes required by Foo.

Screwed, isn't it?

Let's consider the converse, merging Bar into Foo.

        P ---> Foo1 ---> Quux --,-> Foo
         \                     /
          `--> Bar1 -----> Bar2 
                          [Quux]

Again, the only meaningful base for diff3 is P. The conflict situation
is similar to the previous case: some changes made on the Quux bugfix in
Bar2 may conflict using diff3 and be applied without conflict with weave
merge.
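
A toy illustration of the diff3 side of this, assuming texts with the
same number of lines so that a line-by-line three-way merge is enough.
The merge3 function and the file contents are invented for the example;
this is neither bzr's merge code nor a full diff3:

```python
def merge3(base, this, other):
    """Line-by-line three-way merge for texts of equal length."""
    merged, conflicts = [], []
    for lineno, (b, t, o) in enumerate(zip(base, this, other)):
        if t == o or o == b:
            merged.append(t)       # both agree, or only 'this' changed
        elif t == b:
            merged.append(o)       # only 'other' changed
        else:
            merged.append((t, o))  # both changed differently: conflict
            conflicts.append(lineno)
    return merged, conflicts


# Invented contents: P is the base, QUUX is the Foo side where the fix
# is entangled with Foo work, BAR2 is the picked fix with the Foo bits
# removed during review.
P    = ["quux(x)",          "foo()"]
QUUX = ["quux(x, foo_ctx)", "foo(ctx)"]
BAR2 = ["quux(x, None)",    "foo()"]

merged, conflicts = merge3(P, QUUX, BAR2)
print(conflicts)  # [0]
print(merged[1])  # foo(ctx)
```

With P as the base, the line touched by the bugfix differs on both
sides, so it conflicts. That is the conflict a weave merge might
silently avoid, and here it is a meaningful one because it comes from
the removed Foo-specific part.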

You may think that the Foo programmer would review the changes and
easily find out about the issue, but in my experience, I never review
mainline merges. If stuff got into mainline, it has already been audited
and tested and is supposed to be good.

So, apparently, this scheme is not going to work. That's really
annoying. Do you have concrete suggestions about how to improve that?

At the moment we need to tell the Foo branch which changes in the Quux
patch were Foo-specific and need to be preserved when merging branches
that have picked Quux. So we create a Quux-bugfix branch, and merge it
into Foo2.

The diff(Quux, Foo2) should only contain the substantial changes made in
Quux1, like fixing typos and style in the Quux bugfix, but no Foo-stuff
removal or conflict resolution that is apparent in
diff(Quux, patch(P, diff(Foo1, Quux))).

        P ---> Foo1 ---> Quux --,-> Foo2
         \                     /
          `--> Quux1 ---------'
               [Quux]

Now, Bar can merge the Quux-bugfix branch.

        P ---> Foo1 ---> Quux --,-> Foo2
         \                     /
          \-------> Quux1 ----'
           \        [Quux]
            \           \
             `--> Bar1 --`-> Bar2
                             [Quux]

And when merging Foo2 and Bar (in any direction), Quux1 can be used as a
diff3 merge base.
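
The effect of the Quux-bugfix branch on base selection can be checked
with a small sketch. It assumes plain parent-list graphs, and the helper
names are hypothetical. Picked ancestry is left out on purpose, since
only normal ancestry is consulted here:

```python
def ancestry(rev, parents):
    """Return rev plus everything reachable through parent links."""
    seen, todo = set(), [rev]
    while todo:
        r = todo.pop()
        if r not in seen:
            seen.add(r)
            todo.extend(parents[r])
    return seen


def merge_bases(x, y, parents):
    """Common ancestors not dominated by another common ancestor."""
    common = ancestry(x, parents) & ancestry(y, parents)
    return {c for c in common
            if not any(c != o and c in ancestry(o, parents) for o in common)}


# Without a Quux-bugfix branch: Bar2 only picked Quux, no shared parent.
BEFORE = {
    "P": [], "Foo1": ["P"], "Quux": ["Foo1"], "Foo2": ["Quux"],
    "Bar1": ["P"], "Bar2": ["Bar1"],
}
# With Quux1 merged into both Foo2 and Bar2, as in the diagram above.
AFTER = {
    "P": [], "Foo1": ["P"], "Quux": ["Foo1"], "Quux1": ["P"],
    "Foo2": ["Quux", "Quux1"], "Bar1": ["P"], "Bar2": ["Bar1", "Quux1"],
}

print(sorted(merge_bases("Foo2", "Bar2", BEFORE)))  # ['P']
print(sorted(merge_bases("Foo2", "Bar2", AFTER)))   # ['Quux1']
```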

Then the qyzzy programmer finds out he needs the same bugfix. Ideally,
the Quux bugfix would have been merged into the trunk at that point, but
imagine for a moment that reality got in the way and that the PQM that
commits to mainline was unusable for a couple of days.

        P ---> Foo1 ---> Quux --,-> Foo2
         \                     /
          \-------> Quux1 ----'
           \        [Quux]
            |            \
            |            |\
            |\           | \
            | `-> Bar1 --|--`-> Bar2
            |            |      [Quux]
             \            \
              `-> qyz1 ----`--> qyz2
                                [Quux]  

As far as the Bar and qyz branches are concerned, the Quux bugfix is
just a common parent. The cherry-picking does not matter to them. 3-way
merging of Bar2 and qyz2 will just use Quux1 as a base instead of P.  

Now, the slightly embarrassing bit for me is that the picked ancestry
makes no difference. You can get this behavior today with bzr. Which is
fine with me because the picked ancestry value lies in other use cases.

Since I see steam is already venting out of the ears of people in the
audience, I'll save the nice diagrams of those use cases for another
time. And anyway I'm starting to be a bit tired.
-- 
                                                            -- ddaa