Rev 6029: Instrumentation is revealing. in http://bazaar.launchpad.net/~jameinel/bzr/2.4-too-much-walking-388269
John Arbash Meinel
john at arbash-meinel.com
Fri Aug 12 08:39:44 UTC 2011
At http://bazaar.launchpad.net/~jameinel/bzr/2.4-too-much-walking-388269
------------------------------------------------------------
revno: 6029
revision-id: john at arbash-meinel.com-20110812083925-lnz0pz1auiwgstcv
parent: john at arbash-meinel.com-20110811154420-nrcv83c8lh3yl6c3
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: 2.4-too-much-walking-388269
timestamp: Fri 2011-08-12 10:39:25 +0200
message:
Instrumentation is revealing.
Under lsprof it takes 21s to do the fetch with depth=100 and size=100. We have 16 calls to
limited_search_result_from_parent, and only 1 call to 'search_result_from_parent_map'.
What is especially revealing is that 13.7s is spent in the RPC, which means that
depth=100 is expensive on the *server* side, not the client side.
Also interesting is:
5.6802 bzrlib.graph:1986(limited_search_result_from_parent_map)
1.9023 +bzrlib.graph:1937(_find_possible_heads)
2.6919 +bzrlib.graph:2017(first_search)
0.0928 +bzrlib.graph:1317(get_result)
0.9129 +bzrlib.graph:2021(second_search)
For some reason, the search with fewer heads is a lot faster than the original wide search.
Probably an inefficiency with BreadthFirstSearcher.
_find_possible_heads can also probably be tuned:
1.9023 bzrlib.graph:1937(_find_possible_heads)
0.5954 +<method 'difference_update' of 'set' objects>
0.4819 +bzrlib.graph:1921(invert_parent_map)
0.3779 +<method 'update' of 'set' objects>
0.0006 +<method 'add' of 'set' objects>
invert_parent_map doesn't dominate, and we have a lot of calls to difference_update, etc.
-------------- next part --------------
=== modified file 'bzrlib/graph.py'
--- a/bzrlib/graph.py 2011-08-11 15:30:30 +0000
+++ b/bzrlib/graph.py 2011-08-12 08:39:25 +0000
@@ -2014,10 +2014,14 @@
:param depth: How far back to walk.
"""
heads = _find_possible_heads(parent_map, tip_keys, depth)
- s, found_heads = _run_search(parent_map, heads, set(tip_keys))
+ def first_search():
+ return _run_search(parent_map, heads, set(tip_keys))
+ s, found_heads = first_search()
if found_heads:
- s2, extra_heads = _run_search(parent_map, heads - found_heads,
+ def second_search():
+ return _run_search(parent_map, heads - found_heads,
set(tip_keys))
+ s2, extra_heads = second_search()
result1 = s.get_result()
result2 = s2.get_result()
assert result1._keys == result2._keys
=== modified file 'bzrlib/remote.py'
--- a/bzrlib/remote.py 2011-08-11 15:44:20 +0000
+++ b/bzrlib/remote.py 2011-08-12 08:39:25 +0000
@@ -1748,7 +1748,7 @@
(start_set, stop_keys,
key_count) = graph.limited_search_result_from_parent_map(
parents_map, self._unstacked_provider.missing_keys,
- keys, depth=1)
+ keys, depth=100)
else:
(start_set, stop_keys,
key_count) = graph.search_result_from_parent_map(parents_map,
=== modified file 'bzrlib/vf_repository.py'
--- a/bzrlib/vf_repository.py 2011-07-21 18:14:42 +0000
+++ b/bzrlib/vf_repository.py 2011-08-12 08:39:25 +0000
@@ -2503,8 +2503,17 @@
:param revision_ids: The start point for the search.
:return: A set of revision ids.
"""
- import pdb; pdb.set_trace()
+ note('starting')
t = time.time()
+ from bzrlib import commands
+ ret = commands.apply_lsprofiled(',,profile.txt',
+ self._do_walk_to_common_revisions, revision_ids,
+ if_present_ids=if_present_ids)
+ note('Walking took %.3fs' % (time.time() - t))
+ sys.exit(1)
+ return ret
+
+ def _do_walk_to_common_revisions(self, revision_ids, if_present_ids=None):
target_graph = self.target.get_graph()
revision_ids = frozenset(revision_ids)
if if_present_ids:
@@ -2562,9 +2571,6 @@
searcher.stop_searching_any(stop_revs)
if searcher_exhausted:
break
- note('Walking took %.3fs, with %d steps'
- % (time.time() - t, search_step))
- import pdb; pdb.set_trace()
return searcher.get_result()
@needs_read_lock
More information about the bazaar-commits
mailing list