[MERGE] [BUG #172657][1.0] make 'bzr status' after merge faster

John Arbash Meinel john at arbash-meinel.com
Fri Dec 7 21:23:33 GMT 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andrew Bennetts wrote:
> bb:tweak
> 
> I don't have many comments.  The code seems pretty clear (as far as heavily
> algorithmic code goes, anyway...).  The main concern is making known bug that
> Aaron pointed out clear in a comment.

I did find one serious typo, which was that a variable was re-used. I had one
loop for:

for merge in pending:

  for merge, parents in zip(merge_extra, get_parents(merge_extra)):
    ...

  sorter = tsort.MergeSorter(merge_graph, merge)

This would cause it to randomly pick a revision in the ancestry as the
top-level merge item, rather than always picking the pending merge.

I changed the variable to 'extra'. This bogus bit might have been caught by the
extended test suite, but it depended on the sort order of a set. So all cases
might have just happened to succeed.

The only real way I can see to trap *that specific* bug would be to do
something like:
 A     A
 |\    |\
 | B   | C
 | |   | |
 | C   | B
 |/    |/
 D     D

By testing both permutations of sorting you should be guaranteed that one of
them is wrong. But I don't know that it is worth testing for that... (so I didn't)


...

>> +        # Find all of the revisions in the merge source, which are not in the
>> +        # last committed revision.
>> +        # We don't care about last_extra
>> +        last_extra, merge_extra = graph.find_difference(last_revision, merge)
> 
> I've seen Aaron's post, and chatted a little with you about it.  It sounds like
> the bug here is acceptable: the misbehaviour is only cosmetic (status will
> report extra pending merges in certain cases) rather than something severe like
> data loss.  So I'm ok with merging this.
> 
> But this call needs a great big XXX comment above making it clear that there's a
> known bug here, when it will occur, and what the effect will be.  I don't want
> dangerous code to look harmless.

In talking with Ian about it, I think I've found when it could happen.
The specific case occurs because on searcher stops because it reaches the end
of the graph, rather than stopping because all nodes were found to be "common".
Which has a side-effect of leaving NULL_REVISION in the "these nodes belong to
side X". NULL_REVISION is *always* a common ancestor by definition.

So I went with a "detect if NULL_REVISION is present, if so fallback to slow
get_ancestry()" with a mutter() in .bzr.log about it. I thought about adding a
warning to the user about it, but I was following Martin's "if we know it is a
bug, we shouldn't spew it to the user, because we already know we need to fix it.".

At least with mutter() if a user sees something awful they can at least poke
around and see that we are aware of it and are using a workaround.

> 
> [...]
>> +            rev_id_iterator = sorter.iter_topo_order()
>> +            num, first, depth, eom = rev_id_iterator.next()
>> +            assert first == merge
> 
> It might be good to add a message to that assert that includes the repr of first
> and merge, in case something ever does trip it.  That way we'd have a little
> more information to debug with.

Done.

> 
>> === modified file 'bzrlib/tests/test_status.py'
> 
> Thanks for improving test coverage even though this is “just” a performance
> patch!
> 
> [...]
>>  
>> +    def test_multiple_pending(self):
> 
> “multiple pending” is a bit vague for what this test does.  Specifically the
> edge case it's testing is the behaviour of status when multiple pending merges
> introduce the same revision.  It's be good if the name and/or docstring of this
> test method made that clear.
> 
> (The other problem with vague test method names is it tempts people to add more
> tests into the method, rather than making a distinct method for it, which is
> usually clearer.)

I clarified this mostly with a doc-string. Is that reasonable?

> 
>> +        config.GlobalConfig().set_user_option('email', 'Joe Foo <joe at foo.com>')
>> +        tree = self.make_branch_and_tree('a')
>> +        tree.commit('commit 1', timestamp=1196796819, timezone=0)
> 
> Could you make a variable like "dec_4th_2007" for that timestamp?  At the moment
> it's a magic number with an obscure purpose.

I added "date_2007_12_04".

> 
>> +        tree2 = tree.bzrdir.clone('b').open_workingtree()
>> +        tree.commit('commit 2', timestamp=1196796819, timezone=0)
>> +        tree2.commit('commit 2b', timestamp=1196796819, timezone=0)
>> +        tree3 = tree.bzrdir.clone('c').open_workingtree()
>> +        tree2.commit('commit 3b', timestamp=1196796819, timezone=0)
>> +        tree3.commit('commit 3c', timestamp=1196796819, timezone=0)
> 
> It might be clearer to rename “commit 1” to “commit 1a” and “commit 2” to
> “commit 2a”, for consistency.

Done.

> 
> Also, rename tree/tree2/tree3 to tree_a, tree_b and tree_c so they're named
> after their directories.
> 
> Then it'll be trivial to see the correspondence between commit message and which
> tree the commit is made in.  It's slightly confusing at the moment.

Done.

> 
>> +        tree.merge_from_branch(tree2.branch)
>> +        tree.merge_from_branch(tree3.branch)
>> +        output = StringIO()
>> +        tree.lock_read()
>> +        try:
>> +            show_pending_merges(tree, output)
>> +        finally:
>> +            tree.unlock()
>> +        # Even though 2b is merged by 3c also, it should only be displayed
>> +        # the first time it shows u.
> 
> “shows up”, maybe?

Correct, but I switched and went with "appears".

> 
>> +        self.assertEqual('pending merges:\n'
>> +                         '  Joe Foo 2007-12-04 commit 3b\n'
>> +                         '    Joe Foo 2007-12-04 commit 2b\n'
>> +                         '  Joe Foo 2007-12-04 commit 3c\n',
>> +                         output.getvalue())
> 
> Looks good!
> 
> -Andrew.

Attached is an update, which includes the find_differences workaround.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHWbnVJdeBCYSNAAMRAjpvAJ9aJ0qXyNbLpK6ftLc5n2uDSi6MdgCfYtNI
fbxxPb9CcWjZ3Q01t3CEH58=
=biis
-----END PGP SIGNATURE-----



More information about the bazaar mailing list