Please check my thinking on bug 646979
John Arbash Meinel
john at arbash-meinel.com
Mon Oct 4 23:01:01 BST 2010
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 9/29/2010 11:23 PM, James Westby wrote:
...
> Here's why: (apologies to anyone using screen readers or variable width
> fonts)
>
>    ---B---F
>   /  /   /
>  /  / .-D
>  \ A-=
>   \   `-E
>    \     \
>     C------G
>
> (Time passes as you go right)
>
> Here A is an "upstream" revision that is an import of a tarball. B is a
> packaging revision based on that, and C is another packaging revision in
> turn based on that. (Think B==Debian and C==Ubuntu)
> Then D and E are independent packagings of two different upstream
> releases (say Debian jumped to 3.0, and Ubuntu took the point release
> 2.1). F and G are the new packaging revisions based on merging the old
> ones with these new upstreams.
Drawn vertically:
  A
 /|\
E D B
 \ \|\
  \ F C
   \  |
    \ |
     \|
      G
>
> Now if we simply merge all F->G, we try and merge (D, F) in to (C, E,
> G), which means we are merging changes from A->D with changes from A->E,
> or in other words we are merging two upstream versions together.
>
If you just merge F => G, then the common ancestor is B, so it will
merge (F - B) into (G - C) which should pretty much be D.
But yes, it will be mixing the upstream versions together.
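To make the base selection concrete, here is a minimal sketch in plain
Python. It is not bzrlib's actual graph code; the PARENTS table and the
ancestors/find_lcas helpers are names made up for illustration. It encodes
the graph above and computes the least common ancestors of F and G:

```python
# Hypothetical encoding of the graph above: revision -> list of parents.
PARENTS = {
    "A": [],
    "B": ["A"], "D": ["A"], "E": ["A"],
    "C": ["B"],
    "F": ["B", "D"],
    "G": ["C", "E"],
}


def ancestors(rev, parents):
    """Return rev plus every revision reachable through parent links."""
    seen = set()
    todo = [rev]
    while todo:
        r = todo.pop()
        if r not in seen:
            seen.add(r)
            todo.extend(parents[r])
    return seen


def find_lcas(left, right, parents):
    """Common ancestors that are not an ancestor of another common ancestor."""
    common = ancestors(left, parents) & ancestors(right, parents)
    return {
        c for c in common
        if not any(c != other and c in ancestors(other, parents)
                   for other in common)
    }


print(sorted(find_lcas("F", "G", PARENTS)))  # ['B']
```

A and B are both common ancestors, but A is superseded by B, so B is the
only base a plain 3-way merge of F => G would consider.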
> Consider the case of a file with the version number in it. The BASE
> would be "2.0", one side you change it to "2.1" and the other to "3.0",
> but in reality they were sequential changes, we just can't represent
> that here.
>
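The version-number case can be shown with a toy three-way merge of a
single value. This is only a sketch of the general 3-way rule, with a
made-up merge3 helper, not the code bzr uses for file contents:

```python
def merge3(base, this, other):
    """Three-way merge of a single value; returns (result, conflicted)."""
    if this == other:
        return this, False   # both sides agree
    if this == base:
        return other, False  # only OTHER changed the value
    if other == base:
        return this, False   # only THIS changed the value
    return None, True        # both sides changed it differently


# BASE is "2.0"; one side went to "2.1", the other to "3.0".
print(merge3("2.0", "2.1", "3.0"))  # (None, True)

# If the history could express the changes as sequential (2.1 as the
# base for the 3.0 change), the same rule would resolve cleanly.
print(merge3("2.1", "2.1", "3.0"))  # ('3.0', False)
```

The first call conflicts only because both sides differ from the recorded
base; nothing in the graph says that 3.0 came after 2.1.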
> As a packager you don't care about this (generally), and you assume that
> whatever is in the latest is fine (you don't try and track at this point
> patches in 2.1 that didn't make it in to 3.0, and even if you wanted to
> this operation wouldn't just show you that).
>
> What merge package does is first merge the two upstream revisions
> together, taking the tree from whichever has the highest version number.
>
>    ---B---F
>   /  /   /
>  /  / .-D--.
>  \ A-=      H
>   \   `-E-`
>    \     \
>     C------G
>
> Currently it will then just merge H in to G (the target). This can
> generate conflicts, which are very, very confusing to users, as it's
> incredibly hard to explain why they are getting them.
>
Does merging D & E generate conflicts itself? It would seem that if
merging to G generates conflicts, then you should have gotten a conflict
in the intermediate stage as well. (offhand the best you can usually
hope for is more understandable conflicts, unless you have a real
'criss-cross' merge and we are selecting a very poor base.)
Again, redrawing your graph vertically so that I can see it as I'm used to:
  A
 /|\
E D B
|\|\|\
| H F C
 \    |
  '-. |
     \|
      G
Having drawn that, I'm 75% sure that there is no way to merge H => G
that doesn't involve crossing an existing line. The common ancestor is
only E, though. So we probably wouldn't detect it as a criss-cross.
>    ---B---F
>   /  /   /
>  /  / .-D--.
>  \ A-=      H
>   \   `-E-`  \
>    \     \    \
>     C------G---I
>
> Once we have that merged revision we can merge F->I, which will merge
> what we want.
>
> I was thinking about this the other day and realised that unconditionally
> merging H to the target might not be the right thing to do. I think that
> it should merge it in to the side that had the highest version. That
> should never generate conflicts, as the revision that is being merged
> has no changes against the LCA. I think it should generate an
> equivalent merge though.
>
> So, back to the graphs, if we this time consider D to be newer, but
> still want to merge F->G, would it be ok if we created I on the other
> side:
>
>    ---B---F----I
>   /  /   /    /
>  /  / .-D--. /
>  \ A-=      H
>   \   `-E-`
>    \     \
>     C------G
>
> and then merged I->G for our final revision?
  A
 /|\
E D B
|\|\|\
| H F C
 \ \| |
  \ I |
   \  |
    \ |
     \|
      G
Now you have a genuine criss-cross, as the LCAs are E and B (ancestors
of both I and G that are not superseded by a more recent ancestor).
Just using 3-way merge (vs say --weave) I would expect this to conflict
more than merging H => G, because of our specific base selection (when
we find a criss-cross, 3-way goes to the next base, which will be A,
and then tries to merge (I-A) into (G-A)).
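As a rough illustration of that base selection, here is a sketch in plain
Python. It approximates the behaviour described above and is not bzrlib's
implementation; PARENTS, ancestors, find_lcas and pick_base are
hypothetical names, and the graph is assumed to have the single root A:

```python
# Hypothetical encoding of the last graph: revision -> list of parents.
PARENTS = {
    "A": [],
    "B": ["A"], "D": ["A"], "E": ["A"],
    "C": ["B"],
    "F": ["B", "D"],
    "H": ["D", "E"],
    "G": ["C", "E"],
    "I": ["F", "H"],
}


def ancestors(rev):
    """Return rev plus every revision reachable through parent links."""
    seen = set()
    todo = [rev]
    while todo:
        r = todo.pop()
        if r not in seen:
            seen.add(r)
            todo.extend(PARENTS[r])
    return seen


def find_lcas(revs):
    """Common ancestors of all revs that no other common ancestor supersedes."""
    common = set.intersection(*(ancestors(r) for r in revs))
    return {
        c for c in common
        if not any(c != other and c in ancestors(other) for other in common)
    }


def pick_base(left, right):
    """Return (base, is_criss_cross) for a plain 3-way merge."""
    lcas = find_lcas({left, right})
    criss_cross = len(lcas) > 1
    # With several LCAs, step back to the LCAs of the LCAs until one is left.
    while len(lcas) > 1:
        lcas = find_lcas(lcas)
    return next(iter(lcas), None), criss_cross


print(pick_base("H", "G"))  # ('E', False)
print(pick_base("I", "G"))  # ('A', True)
```

Merging H => G gets the single base E, while merging I => G sees both B
and E and falls back to A, so every change since A shows up on both sides.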
>
> I think that it should be "safe" and remove those "odd" conflicts you
> get in the intermediate state, instead moving them to the final merge
> when they should hopefully make sense.
>
> Can anyone generate a scenario where this would give a worse outcome?
> Can anyone in fact generate a scenario where either strategy is the
> wrong thing to do?
>
> The more I think about it the more I am confident in the change, but it
> certainly doesn't seem like an obviously correct change to me.
>
> Thanks,
>
> James
>
My quick analysis says the opposite. The default merge code will give
you worse results if you introduce the artificial "I child of F and H".
Practice matters more than theory, though.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
iEYEARECAAYFAkyqTp0ACgkQJdeBCYSNAAMXYACgiCjKQlo7iX8EPqPOTdpAKUpZ
aPYAoIa8hThmw8jGR2I5hch4XGu5Ykku
=/v/+
-----END PGP SIGNATURE-----