[RFC] Cherry-picking, reordering revisions and 2D versioning

Fri Jan 6 16:16:16 GMT 2006

To say it up front: what I was thinking of was certainly not meant
as a simple tool. It was meant as a quite complex extension of the
deepest logic -- and as such, of course, as a long-term plan. It would add
another kind of relationship between revisions. And it is really just
a vague suggestion -- the logic of merging is not really thought out yet.

On Fri, Jan 06, 2006 at 08:05:21 -0600, John Arbash Meinel wrote:
> Jan Hudec wrote:
> > Hello everyone,
> > 
> > Thinking about problems with merging and bad use of a versioning tool (a bad one,
> > I have to say) that I had at $work yesterday, I came up with a solution that I would
> > like to see. I already talked about implementing 'bzr queues' this way. Now I
> > extend it by removing the distinction between a branch and a queue. This
> > is a draft description of it. The following text I have also put on
> > http://wiki.bazaar.canonical.com/PatchReordering.
> > 
> > Thinking about how Arch, Darcs and Quilt approach cherry-picking, I came up
> > with an interesting idea how to implement it.
> > 
> > Consider a simple cherry-picking situation. Let's suppose the following graph:
> > 
> >   A
> >  / \
> > B   E
> > |
> > C
> > |
> > D
> > 
> > where you want to cherry-pick C onto E. If it were in Darcs, or if the B,C,D
> > branch was a Quilt stack, you would first reorder the patches:
> > 
> >   A
> >  / \
> > C2  E
> > |
> > B2
> > |
> > D
> > 
> > Now C2 and E have a common base, so you can merge them.
> 
> Well, if you change that graph to:
>   A
>  / \
> C2  E
> |
> B2
> |
> D2
> 
> You haven't violated any bzr constraints. Specifically, you have a whole
> new set of revisions whose content changes look a lot like the old
> revisions, but have different lineage.

That's a question. I did specifically want to keep D untouched. But yes,
it was drawn incorrectly. It should have rather been something like:

   .--A
  /  / \
 B  C2  E
 |  |
 C..B2
 |
 D

Where the .. marks a 'reincarnation' relation. It means that the
revisions have equal texts and should be considered equal when searching
for a common ancestor.

Actually, there would be _two_ such relations. Reincarnation of a text
and reincarnation of a diff.

The text reincarnation should actually mean the weave would record which
lines of C2 come from C.
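
To make the text-reincarnation relation concrete, here is a minimal
sketch in Python of a common-ancestor search that treats reincarnated
revisions as equal. The graph is the one drawn above; the `parents` and
`same_text` dictionaries and both function names are made up for
illustration and are not bzr APIs. The diff reincarnation is left out.

```python
from collections import deque


def ancestors(parents, rev):
    """Return rev plus everything reachable through parent links."""
    seen = set()
    queue = deque([rev])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(parents.get(current, ()))
    return seen


def common_ancestors(parents, same_text, left, right):
    """Common ancestors, with reincarnations folded onto one representative.

    same_text maps a reincarnation to the revision it copies (hypothetical
    storage format, assumed only for this sketch).
    """
    def canon(rev):
        return same_text.get(rev, rev)

    left_set = {canon(r) for r in ancestors(parents, left)}
    right_set = {canon(r) for r in ancestors(parents, right)}
    return left_set & right_set


# The graph from the picture: A -> B -> C -> D, A -> C2 -> B2, A -> E.
parents = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["C"],
    "C2": ["A"], "B2": ["C2"], "E": ["A"],
}
# B2 has the same text as C (the '..' relation).
same_text = {"B2": "C"}

plain = common_ancestors(parents, {}, "D", "B2")
with_reincarnation = common_ancestors(parents, same_text, "D", "B2")
```

Without the relation, D and B2 share only A; with it, C also counts as
common. Whether the ancestors of a reincarnation should then be treated
as shared too is exactly the part that is not thought out yet.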

> If you wanted to, the tool could do something like:
> 
>    - A
>  /  / \
> B  C2  E
> |  |
> C  B2
> |  |
> D  D2
> \ /
>  F
> 
> > 
> > So why shouldn't you do exactly the same thing with normal revisions? I believe
> > it should be possible. The reordering should IMHO work like it does in Quilt.
> > That is, take the patch(A, diff(B, C)), let the user review, create C2 and then
> > create B2 as textually equivalent to C (so the diff would be the one of B if
> > they were independent). I consider the step of letting the user review as
> > important.
> 
> The reason you need to create new revisions is because of a simple
> constraint that a revision id uniquely defines the contents and
> ancestry. So if you change the contents or the ancestry, it has to be a
> different revision.
> That is how we make sure that you and I are talking about the same
> thing, so when I say "I have revid:foo in my ancestry" you know exactly
> what text it had.

Yes. I would be creating new revisions. But they would need to be
related to the old ones, in a way that is different from the normal
ancestry relationship.
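
A rough sketch of how the two constraints could coexist: the revision
id keeps depending only on text and ancestry, and the new relation is
recorded beside the graph. The hash-based `revid`, the `Revision` class
and the `reincarnations` table are assumptions for illustration only;
bzr does not form its revision ids this way, and the "diff" entry is
just one reading of the second relation.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    text: str
    parents: tuple  # revision ids of the parents

    @property
    def revid(self):
        # Hypothetical id scheme: a hash over text plus ancestry, so
        # changing either one necessarily yields a different id.
        digest = hashlib.sha1(self.text.encode("utf-8"))
        for parent in self.parents:
            digest.update(b"\0" + parent.encode("utf-8"))
        return digest.hexdigest()


a = Revision("base\n", ())
b = Revision("base\nfrom B\n", (a.revid,))
c = Revision("from C\nbase\nfrom B\n", (b.revid,))

# Reordered line: C2 on top of A, then B2 on top of C2.
c2 = Revision("from C\nbase\n", (a.revid,))
b2 = Revision(c.text, (c2.revid,))

# The extra relation lives next to the graph, not inside the ids.
# Keys are (new revision, old revision); values name the kind of relation.
reincarnations = {
    (b2.revid, c.revid): "text",  # same text as C, different ancestry
    (c2.revid, c.revid): "diff",  # same change as C, different base
}
```

B2 and C have identical text but different ids, so "revid:foo" still
names exactly one text and one ancestry.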

> > There are some points to be solved about how to record that the new revisions
> > replace the old ones and how to treat various merges involving different
> > revision versions. Neither Quilt nor stgit nor mq faces these problems, because they
> > don't push the stacks around. But if we allowed stack-like manipulation on
> > normal revisions, then these issues would arise.
> > 
> > Since the review step already means the C2 does not necessarily have exactly
> > the same diff as C, there is no reason not to support the other quilt
> > operations like splitting and joining revisions.
> 
> I agree. It is simply a tool which automates the creation of new
> revisions based on others. New revisions can be merged, split, whatever
> you like.

But if I want to keep the relations between them, I am not that free.

> > Implements cherry-picking. Unifies classical versioning with quilt-style
> > workflow in a flexible way while keeping all the record of what has happened.
> > Over simple cherry-picking as implemented e.g. in Arch, I see at least the
> > following advantages:
> > 
> >     * It plays better with normal merge, so you can continue to use that
> >       afterwards. Though I suppose weave-merge would do well enough to
> >       avoid spurious conflicts here.
> >     * It allows cherry-picking only part of a changeset and keeping
> >       track of what was done.
> >     * It allows cleaning up your changesets after you already committed
> >       them.
> 
> It would allow easily creating new changesets after you have committed
> others. There is an important distinction here. Because if someone else
> merged B & C, they would have to re-merge B2 & C2, which might cause
> conflicts, because it will look like the same area was changed twice.

Which would defeat the whole purpose of this circus. The key point here
is that, though the new revisions are certainly different, they replace
the old ones in a way understood by bzr.

> >     * It allows cleaning someone else's changesets (usually because you
> >       need to cherry-pick part of a badly done changeset) without
> >       losing any history.
> > 
> > Unresolved Issues:
> > 
> >     * Weaves could probably be used very efficiently to generate the
> >       texts of reordered revisions. Simply use the modified ancestry of
> >       the revision to generate the text. This would not produce rejects
> >       the way inexact patching does. But if the revisions touch the same
> >       parts of a file, the result is likely to be semantically wrong and
> >       needs user attention drawn to it. So the question here is what to
> >       consider a conflict and how to present it to the user.
> >     * It is not yet very clear how exactly the relationship between
> >       versions of a revision should be recorded.
> >     * Quilt does not version the queue nor is it intended to push the
> >       queue around. Thus it does not have to face questions we have to
> >       answer if a new version (incarnation?) of a revision that has already
> >       spread around the world can be created. These include:
> > 	  o How do we merge a newer and an older version (incarnation?) of a
> > 	    revision? The mapping is not necessarily 1-on-1 if revisions
> > 	    can be split and combined. That of course includes merging
> > 	    descendants of such.
> > 	  o How do we present ancestry in a branch containing two
> > 	    independent new versions(reincarnations?) of a revision?
> > 	  o How do we merge such independent new versions of a revision?
> > 	    This might happen if two branches modify the revision and then
> > 	    fetch from each other. This again includes merging descendants
> > 	    of such revision versions.
> >       The solution might involve creating 'rebased' new versions of
> >       some revisions. That means we create new versions of the revisions
> >       that depend on the newest version of their predecessors.
> > 
> 
> Well, the final possibility would be to do:
>    A
>  / | \
> B  |  E
> |  |
> C  |
> | \|
> D  C2
> |  |
> \  B2
>  \ |
>    D2
> 
> When E merges C2, it gets B & C, so it won't try to merge them later.
> But it could cause problems if another person pulled plain B, and then
> tried to merge the child of E, because to them it looks like the child
> already has C, so it won't try to pull it.

No. C2 is not a descendant of C. C2 is a 'reincarnation' of C. The merge
is required to be a weave merge.

Thinking of it, my suggestion, as it stands, does not implement
cherry-picking, but requires it to be implemented beforehand. Then C2 is
a new revision with parent A which cherry-picks C, and B2 is a new
revision with parent C2 which cherry-picks B. B2 would additionally be
marked as an exact copy of C, if the merge has (I believe it would have) a
use for that information.
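
As an illustration of the two cherry-picks, here is a small sketch in
Python. It applies the line changes between two texts onto a third one,
i.e. patch(A, diff(B, C)), using difflib from the standard library.
This is a naive line-based patch under assumed toy texts, not a weave
merge and not anything bzr provides; `cherry_pick` and `Conflict` are
names invented for the example.

```python
from difflib import SequenceMatcher


class Conflict(Exception):
    """A changed region of `old` has no counterpart in `target`."""


def cherry_pick(target, old, new):
    """Apply the line changes old -> new onto target (lists of lines)."""
    # Map line indexes of `old` to the matching indexes in `target`.
    to_target = {}
    matcher = SequenceMatcher(None, old, target, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            to_target[block.a + k] = block.b + k

    result = list(target)
    opcodes = SequenceMatcher(None, old, new, autojunk=False).get_opcodes()
    # Walk the hunks backwards so earlier positions in result stay valid.
    for tag, i1, i2, j1, j2 in reversed(opcodes):
        if tag == "equal":
            continue
        if i1 == i2:
            # Pure insertion: anchor it on a neighbouring line of `old`.
            if i1 > 0 and i1 - 1 in to_target:
                start = end = to_target[i1 - 1] + 1
            elif i1 in to_target:
                start = end = to_target[i1]
            else:
                raise Conflict("no anchor for insertion at line %d" % i1)
        else:
            # Replace or delete: the touched lines must exist, contiguously.
            positions = [to_target.get(i) for i in range(i1, i2)]
            if None in positions:
                raise Conflict("lines %d-%d are not in target" % (i1, i2))
            expected = list(range(positions[0], positions[0] + len(positions)))
            if positions != expected:
                raise Conflict("lines %d-%d are scattered" % (i1, i2))
            start, end = positions[0], positions[-1] + 1
        result[start:end] = new[j1:j2]
    return result


# Toy texts: B appends a line to A, C prepends a line to B.
A = ["base"]
B = ["base", "from B"]
C = ["from C", "base", "from B"]

C2 = cherry_pick(A, B, C)    # parent A, cherry-picks C
B2 = cherry_pick(C2, A, B)   # parent C2, cherry-picks B
```

With these independent changes B2 comes out textually equal to C, which
is the case where marking it as an exact copy makes sense. If C had
instead edited the line that B added, `cherry_pick(A, B, C)` would raise
`Conflict`, which is where the user review step comes in.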

> I think we can probably do something, but we have to be careful because
> of the "revision-id => uniquely maps to a set of texts and ancestry"
> model that we use.

Yes. That should continue to be true, but I want to add things to it.

-- 
						 Jan 'Bulb' Hudec <bulb at ucw.cz>