[MERGE] new merge-type=lca

Alexander Belchenko bialix at ukr.net
Sat Dec 29 14:29:48 GMT 2007


Aaron Bentley writes:
> When we are done going through each LCA version, each unique line will
> be in at least one of the sets.

You wrote "unique line". How do you determine line uniqueness, and how 
do you ensure that it is the same line in BASE/THIS/OTHER?

Or do you not mean lines in the textual sense?
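
If you do mean text lines, here is a toy guess at the bookkeeping, in 
Python (all names are mine, invented for illustration, not your actual 
code): for every distinct line seen in THIS or OTHER, record which LCA 
texts contain it, so each line lands in at least one set.

    # A toy guess (my names, not your code) at what "each unique line
    # will be in at least one of the sets" could mean.
    def line_sets(lca_texts, this_lines, other_lines):
        """Record, for every distinct line of THIS/OTHER, which LCA
        texts contain it; lines found in no LCA go into a 'new' set."""
        lca_sets = {lca_id: set(lines) for lca_id, lines in lca_texts.items()}
        sets = {lca_id: set() for lca_id in lca_texts}
        sets['new'] = set()
        for line in set(this_lines) | set(other_lines):
            owners = [lid for lid, lines in lca_sets.items() if line in lines]
            for lid in owners:
                sets[lid].add(line)
            if not owners:
                sets['new'].add(line)
        return sets

    sets = line_sets({'lca1': ['a\n', 'b\n'], 'lca2': ['a\n', 'c\n']},
                     this_lines=['a\n', 'b\n', 'x\n'],
                     other_lines=['a\n', 'c\n'])
    # sets['new'] == {'x\n'}; every other line is in at least one LCA set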

> Now in three-way merging, we typically talk about regions of text.  In
> weave/knit/newness/lca merge, we also have regions.  Each contiguous
> group of "unchanged" lines is a region, and the areas between them are
> also regions.

So, lca merge will use regions as well?
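
To check that I follow the region idea, here is how I picture it, using 
difflib purely for illustration (bzr has its own matcher, of course):

    import difflib

    def regions(a, b):
        """Split two line lists into alternating regions: each contiguous
        run of matched ("unchanged") lines is one region, and each gap
        between such runs is another."""
        sm = difflib.SequenceMatcher(None, a, b)
        out = []
        ai = bi = 0
        for i, j, size in sm.get_matching_blocks():
            if ai < i or bi < j:              # gap between matched runs
                out.append(('changed', a[ai:i], b[bi:j]))
            if size:                          # run of unchanged lines
                out.append(('unchanged', a[i:i + size]))
            ai, bi = i + size, j + size
        return out

    print(regions(['x\n', 'same\n', 'end\n'],
                  ['y\n', 'z\n', 'same\n', 'end\n']))
    # [('changed', ['x\n'], ['y\n', 'z\n']), ('unchanged', ['same\n', 'end\n'])]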

> Performance:
> Unlike the current weave merge implementation, lca merge does not
> perform any whole-history operations.  LCA selection should scale with
> the number of uncommon revisions.  Text comparison time should scale
> mO(n^2), where m is the number of LCAs, and m is the number of lines in
> the file.  

I assume that n here is the number of lines (the second "m" in the 
quoted paragraph looks like a typo for "n").

> The current weave merge compares each uncommon ancestor,
> potentially several times, so it is >= kO(n^2), where k is the number of
> uncommon ancestors.  So "lca" should beat "weave" both in history
> analysis time and in text comparison time.
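
For concreteness, a back-of-envelope calculation with made-up numbers 
(m, k and n below are purely illustrative, not measurements):

    n = 1000        # lines in the file
    m = 2           # LCAs (a criss-cross typically gives only a few)
    k = 20          # uncommon ancestors weave merge may compare
    print('lca   ~ %d line comparisons' % (m * n ** 2))   # ~2,000,000
    print('weave >= %d line comparisons' % (k * n ** 2))  # >= 20,000,000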

> Possible flaws:
> 1. Inaccurate LCA selection.  Our current LCA algorithm uses
> Graph.heads(), which is known to be flawed.  It may occasionally give
> bad results.  This risk is mitigated by the fact that the per-file
> graphs tend to be simpler than the revision graph.  And since we're
> already using this LCA algorithm, this is not an additional risk.  I
> hope that John Meinel will soon have a fixed version of Graph.heads for us.
> 2. False matches.  Weaves have a concept of line identity, but knits and
> later formats do not.
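
On LCA selection: this is how I picture heads() on a small criss-cross 
graph, as a toy over plain dicts (not the real Graph.heads, which works 
against actual revision data and has the known flaws you mention):

    def ancestors(graph, rev):
        """All strict ancestors of rev in a {rev: [parents]} dict."""
        seen, todo = set(), [rev]
        while todo:
            for parent in graph.get(todo.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    todo.append(parent)
        return seen

    def heads(graph, revs):
        """Drop every candidate that is an ancestor of another one."""
        revs = set(revs)
        strict = set()
        for rev in revs:
            strict |= ancestors(graph, rev)
        return revs - strict

    # criss-cross: THIS and OTHER each merge both B and C
    graph = {'A': [], 'B': ['A'], 'C': ['A'],
             'THIS': ['B', 'C'], 'OTHER': ['B', 'C']}
    common = ancestors(graph, 'THIS') & ancestors(graph, 'OTHER')
    print(heads(graph, common))    # {'B', 'C'}: two LCAs, not one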

I did not know about line identity in weaves. Why was this concept 
dropped? For performance reasons?

>  So a line may appear to be common to two files,
> when in fact it was introduced separately into each for entirely
> different reasons.  This risk is the same for three-way merging.  It is
> mitigated by using Patience sequence matching, which is a
> longest-common-subsequence match.
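
For reference, the core of Patience matching as I understand it, as a 
simplified self-contained sketch: pair up lines that are unique in both 
sequences, then keep the longest subset that stays in order on both 
sides (the real matcher also recurses between these anchors):

    from bisect import bisect_left

    def patience_match(a, b):
        """Match lines unique in both a and b via a longest increasing
        subsequence of their position pairs."""
        def unique_positions(seq):
            first, dup = {}, set()
            for i, line in enumerate(seq):
                if line in first:
                    dup.add(line)
                else:
                    first[line] = i
            return {l: i for l, i in first.items() if l not in dup}

        ua, ub = unique_positions(a), unique_positions(b)
        pairs = sorted((ua[l], ub[l]) for l in ua if l in ub)
        # patience sorting: longest increasing run of b-positions
        tails, piles, back = [], [], {}
        for idx, (ia, ib) in enumerate(pairs):
            k = bisect_left(tails, ib)
            if k == len(tails):
                tails.append(ib)
                piles.append(idx)
            else:
                tails[k] = ib
                piles[k] = idx
            back[idx] = piles[k - 1] if k else None
        matches, idx = [], piles[-1] if piles else None
        while idx is not None:
            matches.append(pairs[idx])
            idx = back[idx]
        return matches[::-1]

    print(patience_match(['a\n', 'b\n', 'c\n'], ['c\n', 'a\n', 'b\n']))
    # [(0, 1), (1, 2)]: 'a' and 'b' line up; 'c' moved out of order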

> 
> I think this could be a great merge algorithm, and a candidate to make
> our default, but this work would not have been possible without the work
> of others, especially:
> - Martin's weave merge and knit/annotate merge algorithms.
> - Bram Cohen's discussions of merge algorithms
> - Andrew Tridgell's dissection of BitKeeper merge
> - Nathaniel Smith's analysis of why criss-cross histories necessarily
>   produce poor three-way merges.
> 
> Aaron

Looks very exciting.

Alexander

PS: A side question from a non-English speaker.
You wrote: "I'm very excited about it, so grab a coffee, and I'll yabber 
on." What does "yabber" mean?



