[RFC] Non-memoisable BreadthFirstSearcher

John Arbash Meinel john at arbash-meinel.com
Fri Apr 25 03:39:56 BST 2008

Hash: SHA1

I've been working on the find_differences/find_unique_ancestors code,
and I'm finding that quite a few of the changes to _BreadthFirstSearcher
made it slower.

For example, in "start_searching()" there is a call to _do_query(),
because it wants to know if there are any ghosts.

However, if I remove that _do_query() call, the average search time
drops from 0.097s => 0.063s.
Or almost 30% of the time is spent in that _do_query().

Some of that might be how I've structured my code. Specifically, I call
start_searching() with the assumption that any nodes that have have
already been searched will be trivially discarded. (I did that, because
otherwise I have to add another parent graph search to find what nodes
are trivial to discard.)

Anyway, I can't say that I fully understand Robert's changes, to make
sure that I don't break anything that he was trying to do.

So my feeling was to write a new searcher, which would just be the bare
minimum. I would also even consider having some aggregate search
objects, because there is a bit of overhead in walking the same graph
nodes multiple times.

Part of the problem is that I'll have up to 30 or so searchers going.
And that is because the only way to know that you've checked enough
common revisions is when all common search revisions are ancestors of
*all* non-common revisions. So you have to search a large set of the
unique revisions (the ones that aren't trivially ancestors of others).
[This is better for find_unique_ancestors, but it really hurts
find_differences() on the bzr.dev code base].

Then again, if everything is fully tested that Robert did, after my
changes all the tests still pass.

Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org


More information about the bazaar mailing list