[MERGE] LCA Merge resolution for tree-shape

Aaron Bentley aaron at aaronbentley.com
Thu Sep 11 00:22:44 BST 2008

Hash: SHA1

John Arbash Meinel wrote:
> ...
>>>> This is mind-melting.  Can we please simplify _entries_lca so that it
>>>> emits *everything* except entries that are the same in THIS and OTHER?
>>>> Trying to keep track of when it defers merge resolution to merge_names
>>>> vs resolving the merge itself is not easy.
>>> Well, because it does it correctly, and keeps us from having to deal
>>> with entries that are not genuinely modified. I realize it is complex.
>>> Most of that is real-world complexity derived from examples in the mysql
>>> history. I believe this is one of the explicit points that they ran
>>> into. Their merge case does indeed have a double-criss-cross that causes
>>> things to look modified when they shouldn't. To get the merge correct,
>>> we have to treat those paths and content as unmodified. (Because it
>>> genuinely hasn't given the per-file graph.)
>> Certainly I'm not trying to say we shouldn't address such cases.  I'm
>> trying to say we shouldn't address such cases in _entries_lca.
>>> In order to generate the right values for the various fields (content
>>> changed, filename changed, etc) I've done 90% of the work of figure out
>>> whether or not we need to do any more work. Why not just omit the ones
>>> that I know don't need to be processed further?
>> This change puts a big dent in our separation of concerns.  I don't have
>> a big investment in any particular separation of concerns, I just want
>> it to be clear what they are.
> ...
>>> 4) Changing _entries_lca to return everything where THIS != OTHER. The
>>> point of what I wrote is to only return entries where OTHER != "base"
>>> (where base is also made up of lcas as appropriate.)
>> 1, 2, and 4 are the only substantial concerns I had, so not addressing
>> them gets you another
> So I think a thread just discussing this is important to have.
> Your specific request is to return everything when "THIS != OTHER". I
> would actually argue that the correct api is to return when OTHER !=
> BASE. We don't really care what has happened in THIS when merging
> *except* when it happens to conflict with what happened in OTHER.

In a three-way merge, things only get interesting when OTHER != THIS and
OTHER != BASE.  If OTHER == BASE, THIS supersedes it (or you have no
change).  If OTHER == THIS, either you have no change, or you have

But we're not talking about a three-way merge, and with all these LCAs
and things, I thought that a THIS != OTHER test would be simpler.

Of course, we can just emit everything, and let the actual merge code
sort things out.  That's what we should do, unless we can avoid doing
some work with with a THIS != OTHER or BASE != OTHER check.  So is
filtering in entries_lca a win at all?

It *used* to be a big win because it allowed us to avoid hitting disk.
But with the caching of SHA1s we have these days, I doubt that's a big
win.  We'll have already made sure there are no changes before starting
to merge.

And if using these shortcuts is still a win, I bet we'd have a shorter,
clearer method if we did ALL the merging in *entries_lca, instead of
handling some of it in _merge_names or _merge_executable.

> For example, merging C into D:
>   A	# create foo and bar
>   |\
>   B C	# B modifies foo, C modifies bar
>   :
>   D	# D deletes foo
> At this point, the delta A => C is only modifying 'bar'. I don't see why
> we should care about what happened to 'foo'. If we return everything
> where "THIS" != "OTHER" then we just have to check whether anything
> really happened again, and filter out the cases where OTHER == BASE.

But similarly:

A # A: foo = bar
B C # B: foo = quxx
      C: foo = quxx

If we merge B into C, we shouldn't care about what happened to foo, even
though OTHER != BASE.

> Now, perhaps you did mean "return everything where OTHER != BASE".

I didn't.

> In
> which case, I'm doing that, with the added complexity that OTHER may
> actually == LCA, rather than BASE. (And often, OTHER == PER_FILE_BASE.)
> And in my opinion, the same thought holds. OTHER didn't actually change
> anything, so we don't need to evaluate that file further.

The problem is that it's quite a lot of added complexity.  entries_lca
seems to most of what merge_names et al are supposed to be doing.

Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org


More information about the bazaar mailing list