Rev 3156: Add new 'LCA merge' merge algorithm (abentley) in file:///home/pqm/archives/thelove/bzr/%2Btrunk/

Thu Jan 3 07:36:18 GMT 2008

At file:///home/pqm/archives/thelove/bzr/%2Btrunk/

------------------------------------------------------------
revno: 3156
revision-id:pqm at pqm.ubuntu.com-20080103073604-bdu4u15nqv5hlqb4
parent: pqm at pqm.ubuntu.com-20080103031640-m5fqotep9qaw2jtp
parent: aaron.bentley at utoronto.ca-20080103063853-7sv9zq8m87p7gzf7
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Thu 2008-01-03 07:36:04 +0000
message:
  Add new 'LCA merge' merge algorithm (abentley)
added:
  doc/developers/lca-merge.txt   lcamerge.txt-20080103061803-9isydn4ivgwrvorw-1
modified:
  NEWS                           NEWS-20050323055033-4e00b5db738777ff
  bzrlib/merge.py                merge.py-20050513021216-953b65a438527106
  bzrlib/option.py               option.py-20051014052914-661fb36e76e7362f
  bzrlib/tests/blackbox/test_merge.py test_merge.py-20060323225809-9bc0459c19917f41
  bzrlib/tests/test_merge.py     testmerge.py-20050905070950-c1b5aa49ff911024
  bzrlib/tree.py                 tree.py-20050309040759-9d5f2496be663e77
  bzrlib/versionedfile.py        versionedfile.py-20060222045106-5039c71ee3b65490
  doc/developers/index.txt       index.txt-20070508041241-qznziunkg0nffhiw-1
    ------------------------------------------------------------
    revno: 3144.3.7
    revision-id:aaron.bentley at utoronto.ca-20080103063853-7sv9zq8m87p7gzf7
    parent: aaron.bentley at utoronto.ca-20080103063407-yicj1hy2v9s3w39j
    committer: Aaron Bentley <aaron.bentley at utoronto.ca>
    branch nick: lcamerge
    timestamp: Thu 2008-01-03 01:38:53 -0500
    message:
      Update from review
    modified:
      bzrlib/merge.py                merge.py-20050513021216-953b65a438527106
      bzrlib/tests/blackbox/test_merge.py test_merge.py-20060323225809-9bc0459c19917f41
      bzrlib/versionedfile.py        versionedfile.py-20060222045106-5039c71ee3b65490
    ------------------------------------------------------------
    revno: 3144.3.6
    revision-id:aaron.bentley at utoronto.ca-20080103063407-yicj1hy2v9s3w39j
    parent: abentley at panoramicfeedback.com-20071228215000-7i88ux30pcrwce0d
    committer: Aaron Bentley <aaron.bentley at utoronto.ca>
    branch nick: lcamerge
    timestamp: Thu 2008-01-03 01:34:07 -0500
    message:
      Add documentation of LCA merge
    added:
      doc/developers/lca-merge.txt   lcamerge.txt-20080103061803-9isydn4ivgwrvorw-1
    modified:
      doc/developers/index.txt       index.txt-20070508041241-qznziunkg0nffhiw-1
    ------------------------------------------------------------
    revno: 3144.3.5
    revision-id:abentley at panoramicfeedback.com-20071228215000-7i88ux30pcrwce0d
    parent: abentley at panoramicfeedback.com-20071228214916-9yttbqxmtkn4fxct
    committer: Aaron Bentley <abentley at panoramicfeedback.com>
    branch nick: lcamerge
    timestamp: Fri 2007-12-28 16:50:00 -0500
    message:
      Add NEWS entry
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
    ------------------------------------------------------------
    revno: 3144.3.4
    revision-id:abentley at panoramicfeedback.com-20071228214916-9yttbqxmtkn4fxct
    parent: abentley at panoramicfeedback.com-20071228214518-kbzpqoco171k1jdu
    parent: pqm at pqm.ubuntu.com-20071227150146-08nqv2gvo5e3i1n3
    committer: Aaron Bentley <abentley at panoramicfeedback.com>
    branch nick: lcamerge
    timestamp: Fri 2007-12-28 16:49:16 -0500
    message:
      Merge from bzr.dev
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
      bzrlib/builtins.py             builtins.py-20050830033751-fc01482b9ca23183
      bzrlib/commands.py             bzr.py-20050309040720-d10f4714595cf8c3
      bzrlib/diff.py                 diff.py-20050309040759-26944fbbf2ebbf36
      bzrlib/tests/test_diff.py      testdiff.py-20050727164403-d1a3496ebb12e339
    ------------------------------------------------------------
    revno: 3144.3.3
    revision-id:abentley at panoramicfeedback.com-20071228214518-kbzpqoco171k1jdu
    parent: abentley at panoramicfeedback.com-20071228202604-kkolnauo09glnsbh
    committer: Aaron Bentley <abentley at panoramicfeedback.com>
    branch nick: lcamerge
    timestamp: Fri 2007-12-28 16:45:18 -0500
    message:
      Update test for new conflicted types
    modified:
      bzrlib/tests/test_merge.py     testmerge.py-20050905070950-c1b5aa49ff911024
    ------------------------------------------------------------
    revno: 3144.3.2
    revision-id:abentley at panoramicfeedback.com-20071228202604-kkolnauo09glnsbh
    parent: abentley at panoramicfeedback.com-20071228201014-hd5g7gsz1ds1mmr4
    committer: Aaron Bentley <abentley at panoramicfeedback.com>
    branch nick: lcamerge
    timestamp: Fri 2007-12-28 15:26:04 -0500
    message:
      Get conflict handling working
    modified:
      bzrlib/merge.py                merge.py-20050513021216-953b65a438527106
      bzrlib/tests/blackbox/test_merge.py test_merge.py-20060323225809-9bc0459c19917f41
      bzrlib/versionedfile.py        versionedfile.py-20060222045106-5039c71ee3b65490
    ------------------------------------------------------------
    revno: 3144.3.1
    revision-id:abentley at panoramicfeedback.com-20071228201014-hd5g7gsz1ds1mmr4
    parent: pqm at pqm.ubuntu.com-20071222080058-lra6luc153ex60w4
    committer: Aaron Bentley <abentley at panoramicfeedback.com>
    branch nick: lcamerge
    timestamp: Fri 2007-12-28 15:10:14 -0500
    message:
      Implement LCA merge, with problematic conflict markers
    modified:
      bzrlib/merge.py                merge.py-20050513021216-953b65a438527106
      bzrlib/option.py               option.py-20051014052914-661fb36e76e7362f
      bzrlib/tests/blackbox/test_merge.py test_merge.py-20060323225809-9bc0459c19917f41
      bzrlib/tests/test_merge.py     testmerge.py-20050905070950-c1b5aa49ff911024
      bzrlib/tree.py                 tree.py-20050309040759-9d5f2496be663e77
      bzrlib/versionedfile.py        versionedfile.py-20060222045106-5039c71ee3b65490
=== added file 'doc/developers/lca-merge.txt'

--- a/doc/developers/lca-merge.txt	1970-01-01 00:00:00 +0000
+++ b/doc/developers/lca-merge.txt	2008-01-03 06:34:07 +0000
@@ -0,0 +1,158 @@
+LCA Merge
+=========
+
+by Aaron Bentley
+
+Essential characteristics
+-------------------------
+
+In the general case (no criss-cross), it is a three-way merge.  When
+there is a criss-cross at the tree level, but not for the particular
+file, it is still a three-way merge.  When there's a file-level
+criss-cross, it's superior to a three-way merge.
+
+Algorithm
+---------
+
+First, we compare the files we are trying to merge, and find the lines
+that differ.  Next, we try to determine why they differ; this is
+essential to the merge operation, because it affects how we resolve the
+differences.  In this merger, there are three possible outcomes:
+
+1. The line was added in this version: "new-this"
+2. The line was deleted in the other version: "killed-other"
+3. The line was preserved as part of merge resolution in this version,
+   but deleted in the other version: "conflicted-this"
+
+Option 3 is new, but I believe it is essential.  When each side has made
+a conflicting merge resolution, we should let the user decide how to
+combine the two resolutions, i.e. we should emit a conflict.  We cannot
+silently drop the line, or silently keep the line, which can happen if
+we choose options 1 or 2.  If we choose options 1 or 2, there's also a
+possibility that a conflict will be produced, but no guarantee.  We need
+a guarantee, which is why we need a new possible outcome.
+
+To decide whether a line is "new-this", "killed-other" or
+"conflicted-this", we compare this version against the versions from
+each "least common ancestor" (LCA), in graph terminology.  For each LCA
+version, if the line is not present in the LCA version, we add it to the
+"new" set.  If the line is present in the LCA version, we add it to the
+"killed" set.
+
+When we are done going through each LCA version, each unique line will
+be in at least one of the sets.  If it is only in the "new" set, it's
+handled as "new-this".  If it is only in the "killed" set, it's handled
+as "killed-other".  If it is in both sets, it's handled as
+"conflicted-this".
+
+The logic here is a bit tricky: first, we know that the line is present
+in some, but not all, LCAs.  We can assume that all LCAs were produced
+by merges of the same sets of revisions.  That means that in those LCAs,
+there were different merge resolutions.  Since THIS and OTHER disagree
+about whether the line is present, those differences have propogated
+into THIS and OTHER.  Therefore, we should declare that the lines are in
+conflict, and let the user handle the issue.
+
+LCA merge and Three-way merge
+-----------------------------
+
+Now, in the common case, there's a single LCA, and LCA merge behaves as
+a three-way merge.  Since there's only one LCA, we cannot get the
+"conflicted-this" outcome, only "new-this" or "killed-other.  Let's look
+at the typical description of three-way merges:
+
++-----+------+-------+------------+
+|THIS | BASE | OTHER | OUT        |
++-----+------+-------+------------+
+|A    | A    | A     | A          |
++-----+------+-------+------------+
+|A    | B    | A     | A          |
++-----+------+-------+------------+
+|A    | B    | B     | A          |
++-----+------+-------+------------+
+|A    | A    | B     | B          |
++-----+------+-------+------------+
+|A    | B    | C     |\*conflict\*|
++-----+------+-------+------------+
+
+Now, let's assume that BASE is a common ancestor, as is typically the
+case.  In fact, for best-case merges, BASE is the sole LCA.
+
+We always pick the version that represents a change from BASE, if there
+is one.  For the AAAA line, there is no change, so the output is
+rightfully BASE/THIS/OTHER.  For ABAA, the THIS and OTHER are changes
+from BASE, and they are the same change so they both win.  (This case is
+sometimes called convergence.)  For ABBA, THIS is a change from BASE, so
+THIS wins.  For AABB, OTHER is a change from BASE, so OTHER wins.  For
+ABC*, THIS and OTHER are both changes to BASE, but they are different
+changes, so they can't both win cleanly.  Instead, we have a conflict.
+
+Now in three-way merging, we typically talk about regions of text.  In
+weave/knit/newness/lca merge, we also have regions.  Each contiguous
+group of "unchanged" lines is a region, and the areas between them are
+also regions.
+
+Let's assign a to THIS and b to OTHER.  "unchanged" regions represent
+the AAAA or ABAA cases; it doesn't matter which, because the outcome is
+the same regardless.  Regions which consist of only "new-a" or
+"killed-a" represent the ABBA case.  Regions which consist of only
+"new-b" or "killed-b" represent the AABB case.  Regions which have
+(new-a or killed-a) AND (new-b or killed-b) are the ABC* case-- both
+sides have made changes, and they are different changes, so a conflict
+must be emitted.
+
+This is what I mean when I say that it is a three-way merge in the
+common case; if there is only one LCA, then it is merely an alternative
+implementation of three-way.  (One that happens to automatically do
+``--reprocess``, ftw).
+
+Why a new name
+--------------
+
+1. It was time.  Although knit / annotate merge and newness merge have
+   tried to emulate the behavior of the original weave merge algorithm,
+   ``--merge-type=weave`` hasn't been based on weaves for a long time.
+2. Behavior differences.  This algorithm should behave like a three-way
+   merge in the common case, while its predecessors did not.  It also has
+   explicit support for handling conflicting merge resolutions, so it
+   should behave better in criss-cross merge scenarios.
+
+Performance
+-----------
+
+Unlike the current "weave" merge implementation, lca merge does not
+perform any whole-history operations.  LCA selection should scale with
+the number of uncommon revisions.  Text comparison time should scale
+mO(n\ :sup:`2`\ ), where m is the number of LCAs, and n is the number of lines
+in the file.  The current weave merge compares each uncommon ancestor,
+potentially several times, so it is >= kO(n\ :sup:`2`\ ), where k is the
+number of uncommon ancestors.  So "lca" should beat "weave" both in history
+analysis time and in text comparison time.
+
+Possible flaws
+==============
+
+1. Inaccurate LCA selection.  Our current LCA algorithm uses
+   ``Graph.heads()``, which is known to be flawed.  It may occasionally give
+   bad results.  This risk is mitigated by the fact that the per-file graphs
+   tend to be simpler than the revision graph.  And since we're already using
+   this LCA algorithm, this is not an additional risk.  I hope that John Meinel
+   will soon have a fixed version of ``Graph.heads`` for us.
+2. False matches.  Weaves have a concept of line identity, but knits and
+   later formats do not.  So a line may appear to be common to two files, when
+   in fact it was introduced separately into each for entirely different
+   reasons.  This risk is the same for three-way merging.  It is mitigated by
+   using Patience sequence matching, which a longest-common-subsequence match.
+
+Acknowledgements
+================
+
+I think this could be a great merge algorithm, and a candidate to make
+our default, but this work would not have been possible without the work
+of others, especially:
+
+- Martin Pool's weave merge and knit/annotate merge algorithms.
+- Bram Cohen's discussions of merge algorithms
+- Andrew Tridgell's dissection of BitKeeper merge
+- Nathaniel Smith's analysis of why criss-cross histories necessarily
+  produce poor three-way merges.

=== modified file 'NEWS'
--- a/NEWS	2008-01-03 02:17:12 +0000
+++ b/NEWS	2008-01-03 07:36:04 +0000
@@ -22,6 +22,9 @@
    * diff '--using' allows an external diff tool to be used for files.
      (Aaron Bentley)
 
+   * New "lca" merge-type for fast everyday merging that also supports
+     criss-cross merges.  (Aaron Bentley)
+
   IMPROVEMENTS:
 
    * ``branch`` and ``checkout`` can now use files from a working tree to

=== modified file 'bzrlib/merge.py'
--- a/bzrlib/merge.py	2007-12-20 04:20:19 +0000
+++ b/bzrlib/merge.py	2008-01-03 06:38:53 +0000
@@ -1041,6 +1041,30 @@
             file_group.append(trans_id)
 
 
+class LCAMerger(WeaveMerger):
+
+    def _merged_lines(self, file_id):
+        """Generate the merged lines.
+        There is no distinction between lines that are meant to contain <<<<<<<
+        and conflicts.
+        """
+        if self.cherrypick:
+            base = self.base_tree
+        else:
+            base = None
+        plan = self.this_tree.plan_file_lca_merge(file_id, self.other_tree,
+                                                  base=base)
+        if 'merge' in debug.debug_flags:
+            plan = list(plan)
+            trans_id = self.tt.trans_id_file_id(file_id)
+            name = self.tt.final_name(trans_id) + '.plan'
+            contents = ('%10s|%s' % l for l in plan)
+            self.tt.new_file(name, self.tt.final_parent(trans_id), contents)
+        textmerge = PlanWeaveMerge(plan, '<<<<<<< TREE\n',
+            '>>>>>>> MERGE-SOURCE\n')
+        return textmerge.merge_lines(self.reprocess)
+
+
 class Diff3Merger(Merge3Merger):
     """Three-way merger using external diff3 for text merging"""
 
@@ -1165,8 +1189,7 @@
             yield "unchanged", text_a
 
 
-class _PlanMerge(object):
-    """Plan an annotate merge using on-the-fly annotation"""
+class _PlanMergeBase(object):
 
     def __init__(self, a_rev, b_rev, vf):
         """Contructor.
@@ -1180,11 +1203,9 @@
         self.lines_a = vf.get_lines(a_rev)
         self.lines_b = vf.get_lines(b_rev)
         self.vf = vf
-        a_ancestry = set(vf.get_ancestry(a_rev, topo_sorted=False))
-        b_ancestry = set(vf.get_ancestry(b_rev, topo_sorted=False))
-        self.uncommon = a_ancestry.symmetric_difference(b_ancestry)
         self._last_lines = None
         self._last_lines_revision_id = None
+        self._cached_matching_blocks = {}
 
     def plan_merge(self):
         """Generate a 'plan' for merging the two revisions.
@@ -1196,29 +1217,34 @@
         VersionedFile.plan_merge
         """
         blocks = self._get_matching_blocks(self.a_rev, self.b_rev)
-        new_a = self._find_new(self.a_rev)
-        new_b = self._find_new(self.b_rev)
+        unique_a, unique_b = self._unique_lines(blocks)
+        new_a, killed_b = self._determine_status(self.a_rev, unique_a)
+        new_b, killed_a = self._determine_status(self.b_rev, unique_b)
+        return self._iter_plan(blocks, new_a, killed_b, new_b, killed_a)
+
+    def _iter_plan(self, blocks, new_a, killed_b, new_b, killed_a):
         last_i = 0
         last_j = 0
-        a_lines = self.vf.get_lines(self.a_rev)
-        b_lines = self.vf.get_lines(self.b_rev)
         for i, j, n in blocks:
-            # determine why lines aren't common
             for a_index in range(last_i, i):
                 if a_index in new_a:
-                    cause = 'new-a'
+                    if a_index in killed_b:
+                        yield 'conflicted-a', self.lines_a[a_index]
+                    else:
+                        yield 'new-a', self.lines_a[a_index]
                 else:
-                    cause = 'killed-b'
-                yield cause, a_lines[a_index]
+                    yield 'killed-b', self.lines_a[a_index]
             for b_index in range(last_j, j):
                 if b_index in new_b:
-                    cause = 'new-b'
+                    if b_index in killed_a:
+                        yield 'conflicted-b', self.lines_b[a_index]
+                    else:
+                        yield 'new-b', self.lines_b[b_index]
                 else:
-                    cause = 'killed-a'
-                yield cause, b_lines[b_index]
+                    yield 'killed-a', self.lines_b[b_index]
             # handle common lines
             for a_index in range(i, i+n):
-                yield 'unchanged', a_lines[a_index]
+                yield 'unchanged', self.lines_a[a_index]
             last_i = i+n
             last_j = j+n
 
@@ -1227,6 +1253,10 @@
 
         See SequenceMatcher.get_matching_blocks
         """
+        cached = self._cached_matching_blocks.get((left_revision,
+                                                   right_revision))
+        if cached is not None:
+            return cached
         if self._last_lines_revision_id == left_revision:
             left_lines = self._last_lines
         else:
@@ -1255,6 +1285,63 @@
             last_j = j + n
         return unique_left, unique_right
 
+    @staticmethod
+    def _subtract_plans(old_plan, new_plan):
+        """Remove changes from new_plan that came from old_plan.
+
+        It is assumed that the difference between the old_plan and new_plan
+        is their choice of 'b' text.
+
+        All lines from new_plan that differ from old_plan are emitted
+        verbatim.  All lines from new_plan that match old_plan but are
+        not about the 'b' revision are emitted verbatim.
+
+        Lines that match and are about the 'b' revision are the lines we
+        don't want, so we convert 'killed-b' -> 'unchanged', and 'new-b'
+        is skipped entirely.
+        """
+        matcher = patiencediff.PatienceSequenceMatcher(None, old_plan,
+                                                       new_plan)
+        last_j = 0
+        for i, j, n in matcher.get_matching_blocks():
+            for jj in range(last_j, j):
+                yield new_plan[jj]
+            for jj in range(j, j+n):
+                plan_line = new_plan[jj]
+                if plan_line[0] == 'new-b':
+                    pass
+                elif plan_line[0] == 'killed-b':
+                    yield 'unchanged', plan_line[1]
+                else:
+                    yield plan_line
+            last_j = j + n
+
+
+class _PlanMerge(_PlanMergeBase):
+    """Plan an annotate merge using on-the-fly annotation"""
+
+    def __init__(self, a_rev, b_rev, vf):
+       _PlanMergeBase.__init__(self, a_rev, b_rev, vf)
+       a_ancestry = set(vf.get_ancestry(a_rev, topo_sorted=False))
+       b_ancestry = set(vf.get_ancestry(b_rev, topo_sorted=False))
+       self.uncommon = a_ancestry.symmetric_difference(b_ancestry)
+
+    def _determine_status(self, revision_id, unique_line_numbers):
+        """Determines the status unique lines versus all lcas.
+
+        Basically, determines why the line is unique to this revision.
+
+        A line may be determined new or killed, but not both.
+
+        :param revision_id: The id of the revision in which the lines are
+            unique
+        :param unique_line_numbers: The line numbers of unique lines.
+        :return a tuple of (new_this, killed_other):
+        """
+        new = self._find_new(revision_id)
+        killed = set(unique_line_numbers).difference(new)
+        return new, killed
+
     def _find_new(self, version_id):
         """Determine which lines are new in the ancestry of this version.
 
@@ -1281,20 +1368,58 @@
                 new.intersection_update(result)
         return new
 
-    @staticmethod
-    def _subtract_plans(old_plan, new_plan):
-        matcher = patiencediff.PatienceSequenceMatcher(None, old_plan,
-                                                       new_plan)
-        last_j = 0
-        for i, j, n in matcher.get_matching_blocks():
-            for jj in range(last_j, j):
-                yield new_plan[jj]
-            for jj in range(j, j+n):
-                plan_line = new_plan[jj]
-                if plan_line[0] == 'new-b':
-                    pass
-                elif plan_line[0] == 'killed-b':
-                    yield 'unchanged', plan_line[1]
-                else:
-                    yield plan_line
-            last_j = j + n
+
+class _PlanLCAMerge(_PlanMergeBase):
+    """
+    This merge algorithm differs from _PlanMerge in that:
+    1. comparisons are done against LCAs only
+    2. cases where a contested line is new versus one LCA but old versus
+       another are marked as conflicts, by emitting the line as conflicted-a
+       or conflicted-b.
+
+    This is faster, and hopefully produces more useful output.
+    """
+
+    def __init__(self, a_rev, b_rev, vf, graph):
+        _PlanMergeBase.__init__(self, a_rev, b_rev, vf)
+        self.lcas = graph.find_lca(a_rev, b_rev)
+        for lca in self.lcas:
+            lca_lines = self.vf.get_lines(lca)
+            matcher = patiencediff.PatienceSequenceMatcher(None, self.lines_a,
+                                                           lca_lines)
+            blocks = list(matcher.get_matching_blocks())
+            self._cached_matching_blocks[(a_rev, lca)] = blocks
+            matcher = patiencediff.PatienceSequenceMatcher(None, self.lines_b,
+                                                           lca_lines)
+            blocks = list(matcher.get_matching_blocks())
+            self._cached_matching_blocks[(b_rev, lca)] = blocks
+
+    def _determine_status(self, revision_id, unique_line_numbers):
+        """Determines the status unique lines versus all lcas.
+
+        Basically, determines why the line is unique to this revision.
+
+        A line may be determined new, killed, or both.
+
+        If a line is determined new, that means it was not present in at least
+        one LCA, and is not present in the other merge revision.
+
+        If a line is determined killed, that means the line was present in
+        at least one LCA.
+
+        If a line is killed and new, this indicates that the two merge
+        revisions contain differing conflict resolutions.
+        :param revision_id: The id of the revision in which the lines are
+            unique
+        :param unique_line_numbers: The line numbers of unique lines.
+        :return a tuple of (new_this, killed_other):
+        """
+        new = set()
+        killed = set()
+        unique_line_numbers = set(unique_line_numbers)
+        for lca in self.lcas:
+            blocks = self._get_matching_blocks(revision_id, lca)
+            unique_vs_lca, _ignored = self._unique_lines(blocks)
+            new.update(unique_line_numbers.intersection(unique_vs_lca))
+            killed.update(unique_line_numbers.difference(unique_vs_lca))
+        return new, killed

=== modified file 'bzrlib/option.py'
--- a/bzrlib/option.py	2007-09-03 01:50:29 +0000
+++ b/bzrlib/option.py	2007-12-28 20:10:14 +0000
@@ -474,6 +474,8 @@
                                    "Merge using external diff3")
 _merge_type_registry.register_lazy('weave', 'bzrlib.merge', 'WeaveMerger',
                                    "Weave-based merge")
+_merge_type_registry.register_lazy('lca', 'bzrlib.merge', 'LCAMerger',
+                                   "LCA-newness merge")
 
 # Declare the standard options
 _standard_option('help', short_name='h',

=== modified file 'bzrlib/tests/blackbox/test_merge.py'
--- a/bzrlib/tests/blackbox/test_merge.py	2007-12-20 04:20:19 +0000
+++ b/bzrlib/tests/blackbox/test_merge.py	2008-01-03 06:38:53 +0000
@@ -70,6 +70,8 @@
         a_tree.revert(backups=False)
         self.run_bzr('merge ../b -r last:1..last:1 --merge-type weave')
         a_tree.revert(backups=False)
+        self.run_bzr('merge ../b -r last:1..last:1 --merge-type lca')
+        a_tree.revert(backups=False)
         self.run_bzr_error(['Show-base is not supported for this merge type'],
                            'merge ../b -r last:1..last:1 --merge-type weave'
                            ' --show-base')
@@ -420,3 +422,30 @@
         other_tree.commit('rev3b')
         self.run_bzr('merge --weave -d this other -r -2..-1')
         self.assertFileEqual('c\na\n', 'this/file')
+
+    def test_lca_merge_criss_cross(self):
+        tree_a = self.make_branch_and_tree('a')
+        self.build_tree_contents([('a/file', 'base-contents\n')])
+        tree_a.add('file')
+        tree_a.commit('', rev_id='rev1')
+        tree_b = tree_a.bzrdir.sprout('b').open_workingtree()
+        self.build_tree_contents([('a/file',
+                                   'base-contents\nthis-contents\n')])
+        tree_a.commit('', rev_id='rev2a')
+        self.build_tree_contents([('b/file',
+                                   'base-contents\nother-contents\n')])
+        tree_b.commit('', rev_id='rev2b')
+        tree_a.merge_from_branch(tree_b.branch)
+        self.build_tree_contents([('a/file',
+                                   'base-contents\nthis-contents\n')])
+        tree_a.set_conflicts(ConflictList())
+        tree_b.merge_from_branch(tree_a.branch)
+        self.build_tree_contents([('b/file',
+                                   'base-contents\nother-contents\n')])
+        tree_b.set_conflicts(ConflictList())
+        tree_a.commit('', rev_id='rev3a')
+        tree_b.commit('', rev_id='rev3b')
+        out, err = self.run_bzr(['merge', '-d', 'a', 'b', '--lca'], retcode=1)
+        self.assertFileEqual('base-contents\n<<<<<<< TREE\nthis-contents\n'
+                             '=======\nother-contents\n>>>>>>> MERGE-SOURCE\n',
+                             'a/file')

=== modified file 'bzrlib/tests/test_merge.py'
--- a/bzrlib/tests/test_merge.py	2007-12-20 02:47:39 +0000
+++ b/bzrlib/tests/test_merge.py	2007-12-28 21:45:18 +0000
@@ -445,11 +445,14 @@
         self.assertEqual(subtracted_plan,
             list(_PlanMerge._subtract_plans(old_plan, new_plan)))
 
-    def test_plan_merge_with_base(self):
+    def setup_merge_with_base(self):
         self.add_version('COMMON', [], 'abc')
         self.add_version('THIS', ['COMMON'], 'abcd')
         self.add_version('BASE', ['COMMON'], 'eabc')
         self.add_version('OTHER', ['BASE'], 'eafb')
+
+    def test_plan_merge_with_base(self):
+        self.setup_merge_with_base()
         plan = self.plan_merge_vf.plan_merge('THIS', 'OTHER', 'BASE')
         self.assertEqual([('unchanged', 'a\n'),
                           ('new-b', 'f\n'),
@@ -457,3 +460,55 @@
                           ('killed-b', 'c\n'),
                           ('new-a', 'd\n')
                          ], list(plan))
+
+    def test_plan_lca_merge(self):
+        self.setup_plan_merge()
+        plan = self.plan_merge_vf.plan_lca_merge('B', 'C')
+        self.assertEqual([
+                          ('new-b', 'f\n'),
+                          ('unchanged', 'a\n'),
+                          ('killed-b', 'c\n'),
+                          ('new-a', 'e\n'),
+                          ('new-a', 'h\n'),
+                          ('killed-a', 'b\n'),
+                          ('unchanged', 'g\n')],
+                         list(plan))
+
+    def test_plan_lca_merge_uncommitted_files(self):
+        self.setup_plan_merge_uncommitted()
+        plan = self.plan_merge_vf.plan_lca_merge('B:', 'C:')
+        self.assertEqual([
+                          ('new-b', 'f\n'),
+                          ('unchanged', 'a\n'),
+                          ('killed-b', 'c\n'),
+                          ('new-a', 'e\n'),
+                          ('new-a', 'h\n'),
+                          ('killed-a', 'b\n'),
+                          ('unchanged', 'g\n')],
+                         list(plan))
+
+    def test_plan_lca_merge_with_base(self):
+        self.setup_merge_with_base()
+        plan = self.plan_merge_vf.plan_lca_merge('THIS', 'OTHER', 'BASE')
+        self.assertEqual([('unchanged', 'a\n'),
+                          ('new-b', 'f\n'),
+                          ('unchanged', 'b\n'),
+                          ('killed-b', 'c\n'),
+                          ('new-a', 'd\n')
+                         ], list(plan))
+
+    def test_plan_lca_merge_with_criss_cross(self):
+        self.add_version('ROOT', [], 'abc')
+        # each side makes a change
+        self.add_version('REV1', ['ROOT'], 'abcd')
+        self.add_version('REV2', ['ROOT'], 'abce')
+        # both sides merge, discarding others' changes
+        self.add_version('LCA1', ['REV1', 'REV2'], 'abcd')
+        self.add_version('LCA2', ['REV1', 'REV2'], 'abce')
+        plan = self.plan_merge_vf.plan_lca_merge('LCA1', 'LCA2')
+        self.assertEqual([('unchanged', 'a\n'),
+                          ('unchanged', 'b\n'),
+                          ('unchanged', 'c\n'),
+                          ('conflicted-a', 'd\n'),
+                          ('conflicted-b', 'e\n'),
+                         ], list(plan))

=== modified file 'bzrlib/tree.py'
--- a/bzrlib/tree.py	2007-12-09 20:58:03 +0000
+++ b/bzrlib/tree.py	2007-12-28 20:10:14 +0000
@@ -291,14 +291,7 @@
         """
         raise NotImplementedError(self.annotate_iter)
 
-    def plan_file_merge(self, file_id, other, base=None):
-        """Generate a merge plan based on annotations.
-
-        If the file contains uncommitted changes in this tree, they will be
-        attributed to the 'current:' pseudo-revision.  If the file contains
-        uncommitted changes in the other tree, they will be assigned to the
-        'other:' pseudo-revision.
-        """
+    def _get_plan_merge_data(self, file_id, other, base):
         from bzrlib import merge, versionedfile
         vf = versionedfile._PlanMergeVersionedFile(file_id)
         last_revision_a = self._get_file_revision(file_id, vf, 'this:')
@@ -307,9 +300,34 @@
             last_revision_base = None
         else:
             last_revision_base = base._get_file_revision(file_id, vf, 'base:')
+        return vf, last_revision_a, last_revision_b, last_revision_base
+
+    def plan_file_merge(self, file_id, other, base=None):
+        """Generate a merge plan based on annotations.
+
+        If the file contains uncommitted changes in this tree, they will be
+        attributed to the 'current:' pseudo-revision.  If the file contains
+        uncommitted changes in the other tree, they will be assigned to the
+        'other:' pseudo-revision.
+        """
+        data = self._get_plan_merge_data(file_id, other, base)
+        vf, last_revision_a, last_revision_b, last_revision_base = data
         return vf.plan_merge(last_revision_a, last_revision_b,
                              last_revision_base)
 
+    def plan_file_lca_merge(self, file_id, other, base=None):
+        """Generate a merge plan based lca-newness.
+
+        If the file contains uncommitted changes in this tree, they will be
+        attributed to the 'current:' pseudo-revision.  If the file contains
+        uncommitted changes in the other tree, they will be assigned to the
+        'other:' pseudo-revision.
+        """
+        data = self._get_plan_merge_data(file_id, other, base)
+        vf, last_revision_a, last_revision_b, last_revision_base = data
+        return vf.plan_lca_merge(last_revision_a, last_revision_b,
+                                 last_revision_base)
+
     def _get_file_revision(self, file_id, vf, tree_revision):
         def file_revision(revision_tree):
             revision_tree.lock_read()

=== modified file 'bzrlib/versionedfile.py'
--- a/bzrlib/versionedfile.py	2007-12-04 03:54:55 +0000
+++ b/bzrlib/versionedfile.py	2008-01-03 06:38:53 +0000
@@ -519,13 +519,21 @@
 
     def plan_merge(self, ver_a, ver_b, base=None):
         """See VersionedFile.plan_merge"""
-        from merge import _PlanMerge
+        from bzrlib.merge import _PlanMerge
         if base is None:
             return _PlanMerge(ver_a, ver_b, self).plan_merge()
         old_plan = list(_PlanMerge(ver_a, base, self).plan_merge())
         new_plan = list(_PlanMerge(ver_a, ver_b, self).plan_merge())
         return _PlanMerge._subtract_plans(old_plan, new_plan)
 
+    def plan_lca_merge(self, ver_a, ver_b, base=None):
+        from bzrlib.merge import _PlanLCAMerge
+        graph = self._get_graph()
+        new_plan = _PlanLCAMerge(ver_a, ver_b, self, graph).plan_merge()
+        if base is None:
+            return new_plan
+        old_plan = _PlanLCAMerge(ver_a, base, self, graph).plan_merge()
+        return _PlanLCAMerge._subtract_plans(list(old_plan), list(new_plan))
 
     def add_lines(self, version_id, parents, lines):
         """See VersionedFile.add_lines
@@ -595,6 +603,18 @@
         else:
             raise errors.RevisionNotPresent(version_id, self._file_id)
 
+    def _get_graph(self):
+        from bzrlib.graph import (
+            DictParentsProvider,
+            Graph,
+            _StackedParentsProvider,
+            )
+        from bzrlib.repofmt.knitrepo import _KnitParentsProvider
+        parent_providers = [DictParentsProvider(self._parents)]
+        for vf in self.fallback_versionedfiles:
+            parent_providers.append(_KnitParentsProvider(vf))
+        return Graph(_StackedParentsProvider(parent_providers))
+
 
 class PlanWeaveMerge(TextMerge):
     """Weave merge that takes a plan as its input.
@@ -653,6 +673,12 @@
             elif state == 'new-b':
                 ch_b = True
                 lines_b.append(line)
+            elif state == 'conflicted-a':
+                ch_b = ch_a = True
+                lines_a.append(line)
+            elif state == 'conflicted-b':
+                ch_b = ch_a = True
+                lines_b.append(line)
             else:
                 assert state in ('irrelevant', 'ghost-a', 'ghost-b', 
                                  'killed-base', 'killed-both'), state

=== modified file 'doc/developers/index.txt'
--- a/doc/developers/index.txt	2007-12-05 18:58:49 +0000
+++ b/doc/developers/index.txt	2008-01-03 06:34:07 +0000
@@ -33,6 +33,8 @@
 
 * `Network protocol <network-protocol.html>`_ |--| Custom network protocol.
 
+* `LCA merge <lca-merge.html>`_ |--| A nice new merge algorithm.
+
 Data formats
 ============