[MERGE][bug #172657] Graph.find_differences() and status after merge

Fri May 2 07:27:21 BST 2008

John Arbash Meinel wrote:
> Ian Clatworthy wrote:
> | John Arbash Meinel wrote:
> |> Attached is an updated form of my previous submission, which I now
> |> consider at least ready for review.
> 
> Attached is the "final form" of the patch. With the updates to
> find_differences,
> and a new find_unique_ancestors function (which status uses). I believe
> I've
> addressed all of your concerns here. There is one test I'm still working
> on, to
> trigger a particular race condition I fixed. But the code itself should be
> correct, so I would rather submit it, and get it reviewed/merged rather
> than
> waiting on that to happen.

I'm fine with that. I'm still digesting the algorithms in graph.py but I
wasn't planning to critique those in this review anyhow. FWIW, I'm
confident that the new APIs are clean, the code implementation is good
quality and that the tests show things are working as documented. Must
say though that reading parts of graph.py makes my head hurt. :-)

Here are some performance figures (best of two runs) after merging this
into bzr.dev:

* time ../bzr.dev/bzr status = 4.10 sec
* time ../bzr.dev/bzr status --no-pending 0.42 sec
* time ./bzr status = 2.59 sec
* time ./bzr status --no-pending 0.42 sec

So it's much better but we're still paying a high cost for finding,
sorting and presenting all of the detailed revision information we do,
even with your improved algorithm. Maybe we ought to cache the full
pending merge details in the working tree so we only calculate it once?
Or perhaps we should just present the merge tips and only show the rest
in verbose mode? Regardless, let's land this improvement.

bb:tweak

> +    :return: An the iterator from MergeSorter.iter_topo_order()

s/An the//

>  def show_pending_merges(new, to_file, short=False):
>      """Write out a display of pending merges in a working tree."""
> +    # we need one extra space for terminals that wrap on last char
> +    term_width = osutils.terminal_width() - 1
> +    if short:
> +        first_prefix = 'P   '
> +        sub_prefix = 'P.   '
> +    else:
> +        first_prefix = '  '
> +        sub_prefix = '    '
> +
>      parents = new.get_parent_ids()
>      if len(parents) < 2:
>          return

These last 3 lines (check for the len(parents)) ought to be at the very
top of the routine.

> +    ignore = set([None, last_revision, _mod_revision.NULL_REVISION])
> +    graph = branch.repository.get_graph()
> +    other_revisions = [last_revision]

Do you still need ignore now that you've introduced other_revisions?
ignore is still checked during the inner loop test but I couldn't see it
, unlike other_revisions, being updated anywhere.

> +complex_shortcut2 = {'a':[NULL_REVISION], 'b':['a'], 'c':['b'], 'd':['c'],
> +                    'e':['d'], 'f':['e'],
> +                    'g':['f'], 'h':['d'], 'k':['h', 'i'], 'j':['h'],
> +                    'i':['g'], 'l':['k'], 'm':['l'],
> +                    'n':['m'], 't':['i', 's'], 'u':['s', 'j'],
> +                    'o':['n'], 'p':['o'], 'q':['p'],
>                      'r':['q'], 's':['r'],
>                      }

For slightly improved clarity, these should be listed in alphabetical
order like all the other graph definitions.

> @@ -264,6 +410,10 @@
>          self.calls.extend(nodes)
>          return self._real_parents_provider.get_parent_map(nodes)
>  
> +    def get_parent_map(self, nodes):
> +        self.calls.extend(nodes)
> +        return self._real_parents_provider.get_parent_map(nodes)
> +

This duplication was mentioned in my earlier review I think.

> +    def test__remove_simple_descendants(self):
> +        graph = self.make_graph(ancestry_1)
> +        self.assertRemoveDescendants(set(['rev1']), graph,
> +            set(['rev1', 'rev2a', 'rev2b', 'rev3', 'rev4']))
> +
> +    def test__remove_simple_descendants_disjoint(self):
> +        graph = self.make_graph(ancestry_1)
> +        self.assertRemoveDescendants(set(['rev1', 'rev3']), graph,
> +            set(['rev1', 'rev3']))
> +
> +    def test__remove_simple_descendants_chain(self):
> +        graph = self.make_graph(ancestry_1)
> +        self.assertRemoveDescendants(set(['rev1']), graph,
> +            set(['rev1', 'rev2a', 'rev3']))
> +
> +    def test__remove_simple_descendants_siblings(self):
> +        graph = self.make_graph(ancestry_1)
> +        self.assertRemoveDescendants(set(['rev2a', 'rev2b']), graph,
> +            set(['rev2a', 'rev2b', 'rev3']))
> +

Is there any reason for the double underscore in these method names?

BTW, I really like the new show_pending_merges() code and the better
display of ghosts it provides. The new tests are great as well. Thanks.

Ian C.