fix for a keyerror in finding least common ancestors

Tue Jul 31 11:17:50 BST 2007

Robert Collins wrote:
> On Mon, 2007-07-30 at 09:16 -0400, Aaron Bentley wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
> 
>>> I think the most obvious fix (attached) is actually the right fix (it's
>>> not at all obvious to me that the candidate being deleted here hasn't
>>> already been deleted by line 249).
>> I don't agree.  There's only one place that deletes an active searcher:
>> line 249.  'candidate' is a key in the active_searchers dictionary, so
>> there can't be two entries for
>> 'pqm at pqm.ubuntu.com-20070712051010-4o5ij0wgrvhvdf0o'.
>>
>> It's obvious to me that if we are walking through active_searchers, and
>> we hit a StopIteration, then candidate must be present in
>> active_searchers.  The fact that this can fail suggests a deeper logic
>> bug, and I don't want to apply a band-aid.  I want to understand and
>> solve the problem.
> 
> Agreed.
> 
>>> In a half an hour of trying, I
>>> haven't been able to make a test case
>> That's a shame, and considering it's Launchpad, I can't just download it
>> myself to reproduce.
> 
> Michael, can you do the following:
> from bzrlib import branch, repository
> b1 = branch.Branch.open('.')
> b2 = branch.Branch.open(MERGE_SOURCE)
> graph1 =
> b1.repository.get_revision_graph_with_ghosts(b1.last_revision())
> graph2 =
> b2.repository.get_revision_graph_with_ghosts(b2.last_revision())
> 
> Now, because there are arch ids in the deep history, its possible that
> there is sensitive info (such as old branch names), so check with your
> manager before sending the output anywhere. But - graph1 and graph2
> should be enough to hook up to the LCA logic as a test case, with a
> different parent-getter routine.

This 3-line but 600-odd kilobyte file:

    http://python.net/crew/mwh/g.txt

contains:

 - on line 1, a sanitized-revid repr() of the _graph_ancestors of the
graph with ghosts of the repository that I can reproduce the problem with
 - on lines 2 and 2 the revids that should be used as arguments to
find_unique_lca.

I tried to make a test case like this:

    def test_graph_lca_huge(self):
        import os
        f = open(os.path.join(os.path.dirname(__file__), 'g.txt'))
        graph_repr = f.readline()
        revid1 = f.readline()
        revid2 = f.readline()
        graph = self.make_graph(eval(graph_repr))
        graph.find_unique_lca(revid1, revid2)

but this has been running for ~3 hours now, so another approach is
probably required :-)

Hope this is useful.

Cheers,
mwh