Optimal Merge Base selection

Mon Jul 11 03:19:30 BST 2005

Martin Pool wrote:
> On 10 Jul 2005, John Arbash Meinel <john at arbash-meinel.com> wrote:
>
>
>>This change would mean that trees with no real common ancestry, which
>>have only merged from a common group, would not find their commonality.
>>I mean something like this:
>>
>>Me:    A - B - C - D - E - F
>>#            /           /
>>Other: G - H           /
>>#            \       /
>>You:   J - K - L - M
>>
>>In this case, it would not find H as a merge base, since it is not along
>>the main line of development for either tree.
>>Though how 'H' got merged in, considering it doesn't share history, is
>>left as an exercise for the user.
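
To make the diagram concrete, here is a rough sketch of the difference
between searching the full ancestry and searching only each tree's main
line. The parent map is my reading of the diagram (H merged into C and L,
M merged into F), and every name below (PARENTS, ancestors, mainline,
bases_full, bases_mainline) is made up for illustration; none of it is
bzr's actual code.

```python
# Hypothetical parent map for the diagram above.
# The first parent is the main line; later parents are merged-in revisions.
PARENTS = {
    "A": [], "B": ["A"], "C": ["B", "H"], "D": ["C"], "E": ["D"],
    "F": ["E", "M"],
    "G": [], "H": ["G"],
    "J": [], "K": ["J"], "L": ["K", "H"], "M": ["L"],
}


def ancestors(rev):
    """Every revision reachable from rev through any parent, including rev."""
    seen, todo = set(), [rev]
    while todo:
        r = todo.pop()
        if r not in seen:
            seen.add(r)
            todo.extend(PARENTS[r])
    return seen


def mainline(rev):
    """Revisions reached from rev by following first parents only."""
    line = [rev]
    while PARENTS[line[-1]]:
        line.append(PARENTS[line[-1]][0])
    return line


def bases_full(a, b):
    """Common ancestors that are not themselves ancestors of another candidate."""
    common = ancestors(a) & ancestors(b)
    return {c for c in common
            if not any(c != d and c in ancestors(d) for d in common)}


def bases_mainline(a, b):
    """Common revisions when only the two main lines are searched."""
    return set(mainline(a)) & set(mainline(b))
```

Merging M into E (the step that produces F) finds H with bases_full, but
nothing at all with bases_mainline, which is the case described above.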
>
>
> H will get merged in as a two-way merge; that is to say by using the
> empty tree as a base.  I did this the other day to bring in the weave.py
> file, which started in a different directory.
>
> This case works best when there is no overlap in the file names.  (It
> might be nice if there was a way to merge the root of OTHER into a
> subdirectory of THIS.)

Couldn't this be handled with my "TREE_ROOT gets a real id" suggestion?

>
> If any paths do collide they're likely to have different file-ids; in
> the short term we need to choose one; in the long term it would be nice
> to mark the file-ids as merged.
>
>
>>Say somehow Me and You both merged a library (Other) into our main
>>program. And then You update the library, and I want to merge your
>>changes. In arch this could be done with a "baz replay M", which would
>>even ignore any changes in M that did not affect shared code. I don't
>>really know what "baz merge" would do. And now that it is after 1am, I
>>can't think what it should do.
>
>
> Does replay just ignore any patches that touch files not present in the
> destination?   I suppose we could support such a behaviour as an option
> to merge: don't update files that don't already exist, and disregard
> files with the same name but different ids.

Well, I believe everything in arch is done by id. So if the ids are
not the same, it is not the same file, regardless of the path. (The one
exception that I know of is the path that the file will be put in, but
that was declared a bug).

Replay actually puts the patches for missing files into a directory
named something like "+patches-missing-files.23123".
So it sort of ignores them. I think it should print a warning, and put
the patch somewhere, but not actually stop the merge/replay.
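
Roughly what I have in mind, as a sketch only: match patches to files by
id, apply the ones whose id exists in the destination, and set the rest
aside with a warning instead of stopping. The function name, the
dict-of-ids tree and the (file_id, patch) pairs are all invented for the
example; this is not how arch or baz actually stores things.

```python
def replay(tree_ids, patches):
    """Split patches into those that can be applied and those set aside.

    tree_ids: dict mapping file id -> path in the destination tree.
    patches:  list of (file_id, patch_text) pairs from the replayed revision.
    Matching is by id only, so a file with the same path but a different
    id is treated as a different file.
    """
    applied, set_aside, warnings = [], [], []
    for file_id, patch in patches:
        if file_id in tree_ids:
            # The id exists here: apply at wherever the file lives locally.
            applied.append((tree_ids[file_id], patch))
        else:
            # Unknown id: keep the patch around, warn, and carry on.
            set_aside.append((file_id, patch))
            warnings.append("no file with id %s; patch set aside" % file_id)
    return applied, set_aside, warnings
```

The point of returning the set-aside patches rather than raising is that
the replay as a whole still completes.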

>
> In arch, baz and bzr this will mean we record that replayed version as
> merged, but in fact not all of the changes have been taken in.  That
> might cause trouble later.

How so? How is that different from doing a merge, and then editing out
the parts that you don't like?

I realize that bzr uses a different definition of what merging a
revision means, but it seems like at some point merges need to be edited
to get them to fit into the local tree. So having 'missing' is not
really different from having 'edited'.

>
> I suppose this is a limitation of the basic assumption that merges are
> tracked per-tree not per-file.  That could perhaps be relaxed, but it
> seems to make things rather more complex.
>

Well, Aaron defined that the way to do the merge is "bzr merge M L",
basically, take the difference between the M snapshot and the L
snapshot, and merge that into my local tree.
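
As a very rough sketch of what "take the difference between two
snapshots and merge it into my tree" means, assuming whole-file
granularity and snapshots modelled as plain dicts of path -> content
(bzr itself tracks file ids, so this is a simplification, and
merge_snapshots is a made-up name):

```python
def merge_snapshots(this, base, other):
    """Apply the base -> other difference to this, one whole file at a time.

    Each snapshot is a dict mapping path -> content; a missing path means
    the file does not exist in that snapshot.
    Returns (merged_tree, list_of_conflicting_paths).
    """
    merged, conflicts = dict(this), []
    for path in sorted(set(base) | set(other)):
        b, o, t = base.get(path), other.get(path), this.get(path)
        if o == b:
            continue                    # no change between base and other
        if t == b or t == o:
            # We did not touch it (or made the same change): take other's.
            if o is None:
                merged.pop(path, None)  # other deleted the file
            else:
                merged[path] = o
        else:
            conflicts.append(path)      # both sides changed it differently
    return merged, conflicts
```

With base = L and other = M, files that only exist in my tree are left
alone, and files I changed differently from M are reported rather than
overwritten.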

For starters, I don't have any problem with bzr not handling
cherry-picking. As long as I *can* cherry-pick, it doesn't have to mess
up merging. Bzr works in snapshots, not differences anyway, so probably
a lot of cherry-picks will show up as a similar diff.
Now, if bzr incorporates weave/codeville merging, then you need to be a
little bit more careful about cherry-picking, because you start
detecting that a future diff overrules the previous diff. So you need
to make sure that you know where the original came from, to see if any
new ones overrule it.

So, on a different track... How does Codeville handle branching ancestry?
If you annotate each line with a number indicating the revision in which
it was modified, you can easily see that 10 > 9, thus 10 should take priority.
But with a truly distributed setup, don't you have the problem that it
isn't obvious whether john at arbash-meinel.com-200513123123-aontehuntaho
comes before or after mdp at sourcefrog.net-20051423423-aoehunnth?

Or is it that when you merge my changes, they get re-labeled with the
revision number where they were merged, and you just don't worry that
they came from me? But then if you merge from me again, how do you
detect that the change you made on line 10 supersedes my change on line 10?
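
To illustrate the ordering problem (and only that; I am not claiming this
is what Codeville does): with distributed revision ids you cannot compare
the ids themselves, so "supersedes" would have to mean "is a descendant
of" in the revision graph, and some pairs simply have no order. The ids
and the PARENTS map below are invented for the example.

```python
# Hypothetical history: two people branch from john-1, then mdp merges john-2.
PARENTS = {
    "john-1": [],
    "mdp-1": ["john-1"],
    "john-2": ["john-1"],
    "mdp-2": ["mdp-1", "john-2"],
}


def is_ancestor(old, new):
    """True if old can be reached from new by following parent links."""
    todo, seen = list(PARENTS[new]), set()
    while todo:
        r = todo.pop()
        if r == old:
            return True
        if r not in seen:
            seen.add(r)
            todo.extend(PARENTS[r])
    return False


def compare(a, b):
    """Order two revisions by ancestry rather than by their id strings."""
    if a == b:
        return "same"
    if is_ancestor(a, b):
        return "b supersedes a"
    if is_ancestor(b, a):
        return "a supersedes b"
    return "concurrent"   # neither contains the other; no priority is implied
```

Here compare("mdp-1", "john-2") comes out as "concurrent", which is
exactly the case where a simple 10 > 9 comparison has no equivalent.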

I might be way too tangential and lost in my own thoughts. I can't say
that I have spent a long time thinking about Codeville merging, other
than the cursory, "looks kind of interesting".

John
=:->