[MERGE] new merge-type=lca
Andrew Bennetts
andrew at canonical.com
Thu Jan 3 04:41:11 GMT 2008
Aaron Bentley wrote:
> Hi all,
>
> This patch implements support for the LCA merge type, which I believe is
> a great general-purpose merge algorithm, with excellent criss-cross
> support. I'm very excited about it, so grab a coffee, and I'll yabber on.
>
> In the general case (no criss-cross), it is a three-way merge. When
> there is a criss-cross at the tree level, but not for the particular
> file, it is still a three-way merge. When there's a file-level
> criss-cross, it's superior to a three-way merge.
>
> Here's how it works:
[...]
This is a very good description. It'd be great to preserve it for interested
developers in doc/developers somewhere.
I wonder if this merge is good enough that we don't need to warn users that
we've detected a criss-cross when merging?
Anyway, I'm pretty happy with this code. The changes are generally pretty easy
to follow.
bb:tweak
Comments inline:
> === modified file 'bzrlib/merge.py'
> --- bzrlib/merge.py 2007-12-20 04:20:19 +0000
> +++ bzrlib/merge.py 2007-12-28 20:26:04 +0000
> @@ -1041,6 +1041,30 @@
> file_group.append(trans_id)
>
>
> +class LCAMerger(WeaveMerger):
> +
> + def _merged_lines(self, file_id):
> + """Generate the merged lines.
> + There is no distinction between lines that are meant to contain <<<<<<<
> + and conflicts.
> + """
> + if self.cherrypick:
> + base = self.base_tree
> + else:
> + base = None
> + plan = self.this_tree.plan_file_lca_merge(file_id, self.other_tree,
> + base=base)
> + if 'merge' in debug.debug_flags:
> + plan = list(plan)
> + trans_id = self.tt.trans_id_file_id(file_id)
> + name = self.tt.final_name(trans_id) + '.plan'
> + contents = ('%10s|%s' % l for l in plan)
> + self.tt.new_file(name, self.tt.final_parent(trans_id), contents)
> + textmerge = PlanWeaveMerge(plan, '<<<<<<< TREE\n',
> + '>>>>>>> MERGE-SOURCE\n')
> + return textmerge.merge_lines(self.reprocess)
It's a shame you need to duplicate so many lines here to change one function
call. I guess it's not worth adding complexity to do something about it.
> + @staticmethod
> + def _subtract_plans(old_plan, new_plan):
> + matcher = patiencediff.PatienceSequenceMatcher(None, old_plan,
> + new_plan)
> + last_j = 0
> + for i, j, n in matcher.get_matching_blocks():
> + for jj in range(last_j, j):
> + yield new_plan[jj]
> + for jj in range(j, j+n):
> + plan_line = new_plan[jj]
> + if plan_line[0] == 'new-b':
> + pass
> + elif plan_line[0] == 'killed-b':
> + yield 'unchanged', plan_line[1]
> + else:
> + yield plan_line
> + last_j = j + n
I see this function is just moved, rather than added, but I find it very hard to
understand. It's not obvious to me why subtracting one plan from another should
produce this particular transformation.
It's tangential to your change, so don't feel that you need to do this to get my
approval for lca-merge, but some explanatory comments for this functional would
be extremely helpful.
> +class _PlanMerge(_PlanMergeBase):
> + """Plan an annotate merge using on-the-fly annotation"""
> +
> + def __init__(self, a_rev, b_rev, vf):
> + _PlanMergeBase.__init__(self, a_rev, b_rev, vf)
> + a_ancestry = set(vf.get_ancestry(a_rev, topo_sorted=False))
> + b_ancestry = set(vf.get_ancestry(b_rev, topo_sorted=False))
> + self.uncommon = a_ancestry.symmetric_difference(b_ancestry)
> +
> + def _determine_status(self, revision_id, unique_lines):
> + """Determines the status unique lines versus all lcas.
> +
> + Basically, determines why the line is unique to this revision.
> +
> + A line may be determined new or killed, but not both.
> +
> + :return a tuple of (new_this, killed_other):
> + """
> + new = self._find_new(revision_id)
> + killed = set(unique_lines).difference(new)
> + return new, killed
It'd help to say something like “:param unique_lines: a collection of line
numbers”. Or perhaps just rename it to “unique_line_numbers”. When I see a
variable called “foo_line”, I tend to expect it to be the contents of the line,
not a line number.
> +
> +class _PlanLCAMerge(_PlanMergeBase):
> + """
> + This implementation differs from _PlanMerge.plan_merge in that:
> + 1. comparisons are done against LCAs only
> + 2. cases where a contested line is new versus one LCA but old versus
> + another are marked as conflicts, by emitting the line as both new-a and
> + killed-b (or new-b, killed-a).
> +
> + This is faster, and hopefully produces more useful output.
> + """
It's odd to describe a class' implementation as differing from a method.
Did you mean to say “This implementation differs from _PlanMergeBase in that:
...”?
> + def _determine_status(self, revision_id, unique_lines):
> + """Determines the status unique lines versus all lcas.
> +
> + Basically, determines why the line is unique to this revision.
> +
> + A line may be determined new, killed, or both.
> +
> + If a line is determined new, that means it was not present at least one
> + LCA, and is not present in the other merge revision.
Typo: “... *in* at least one ...”
> === modified file 'bzrlib/tests/blackbox/test_merge.py'
[...]
> +
> + def test_lca_merge_criss_cross(self):
> + tree_a = self.make_branch_and_tree('a')
> + self.build_tree_contents([('a/file', 'base-contents\n')])
> + tree_a.add('file')
> + tree_a.commit('', rev_id='rev1')
> + tree_b = tree_a.bzrdir.sprout('b').open_workingtree()
> + self.build_tree_contents([('a/file',
> + 'base-contents\nthis-contents\n')])
> + tree_a.commit('', rev_id='rev2a')
> + self.build_tree_contents([('b/file',
> + 'base-contents\nother-contents\n')])
> + tree_b.commit('', rev_id='rev2b')
> + self.build_tree_contents([('a/file',
> + 'base-contents\nthis-contents\n')])
Isn't this call to build_tree_contents redundant?
> === modified file 'bzrlib/versionedfile.py'
[...]
>
> + def plan_lca_merge(self, ver_a, ver_b, base=None):
> + from merge import _PlanLCAMerge
Relative imports are fragile. Always use “from bzrlib.foo ...”
[...]
>
> + def _get_graph(self):
> + from graph import DictParentsProvider, Graph, _StackedParentsProvider
> + from repofmt.knitrepo import _KnitParentsProvider
Again, don't use relative imports.
-Andrew.
More information about the bazaar
mailing list