Rev 6029: Instrumentation is revealing. in http://bazaar.launchpad.net/~jameinel/bzr/2.4-too-much-walking-388269

John Arbash Meinel john at arbash-meinel.com
Fri Aug 12 08:39:44 UTC 2011


At http://bazaar.launchpad.net/~jameinel/bzr/2.4-too-much-walking-388269

------------------------------------------------------------
revno: 6029
revision-id: john at arbash-meinel.com-20110812083925-lnz0pz1auiwgstcv
parent: john at arbash-meinel.com-20110811154420-nrcv83c8lh3yl6c3
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: 2.4-too-much-walking-388269
timestamp: Fri 2011-08-12 10:39:25 +0200
message:
  Instrumentation is revealing.
  
  Under lsprof it takes 21s to do the fetch with depth=100 and size=100. We have 16 calls to
  limited_search_result_from_parent, and only 1 call to 'search_result_from_parent_map'.
  What is especially revealing is that 13.7s is spent in the RPC, which means that
  depth=100 is expensive on the *server* side, not the client side.
  
  Also interesting is:
  5.6802  bzrlib.graph:1986(limited_search_result_from_parent_map)
  1.9023  +bzrlib.graph:1937(_find_possible_heads)
  2.6919  +bzrlib.graph:2017(first_search)
  0.0928  +bzrlib.graph:1317(get_result)
  0.9129  +bzrlib.graph:2021(second_search)
  
  For some reason, the search with fewer heads is a lot faster than the original wide search.
  Probably an inefficiency with BreadthFirstSearcher.
  _find_possible_heads can also probably be tuned:
  1.9023  bzrlib.graph:1937(_find_possible_heads)
  0.5954  +<method 'difference_update' of 'set' objects>
  0.4819  +bzrlib.graph:1921(invert_parent_map)
  0.3779  +<method 'update' of 'set' objects>
  0.0006  +<method 'add' of 'set' objects>
  
  invert_parent_map doesn't dominate, and we have a lot of calls to difference_update, etc.
-------------- next part --------------
=== modified file 'bzrlib/graph.py'
--- a/bzrlib/graph.py	2011-08-11 15:30:30 +0000
+++ b/bzrlib/graph.py	2011-08-12 08:39:25 +0000
@@ -2014,10 +2014,14 @@
     :param depth: How far back to walk.
     """
     heads = _find_possible_heads(parent_map, tip_keys, depth)
-    s, found_heads = _run_search(parent_map, heads, set(tip_keys))
+    def first_search():
+        return _run_search(parent_map, heads, set(tip_keys))
+    s, found_heads = first_search()
     if found_heads:
-        s2, extra_heads = _run_search(parent_map, heads - found_heads,
+        def second_search():
+            return _run_search(parent_map, heads - found_heads,
                                       set(tip_keys))
+        s2, extra_heads = second_search()
         result1 = s.get_result()
         result2 = s2.get_result()
         assert result1._keys == result2._keys

=== modified file 'bzrlib/remote.py'
--- a/bzrlib/remote.py	2011-08-11 15:44:20 +0000
+++ b/bzrlib/remote.py	2011-08-12 08:39:25 +0000
@@ -1748,7 +1748,7 @@
             (start_set, stop_keys,
              key_count) = graph.limited_search_result_from_parent_map(
                 parents_map, self._unstacked_provider.missing_keys,
-                keys, depth=1)
+                keys, depth=100)
         else:
             (start_set, stop_keys,
              key_count) = graph.search_result_from_parent_map(parents_map,

=== modified file 'bzrlib/vf_repository.py'
--- a/bzrlib/vf_repository.py	2011-07-21 18:14:42 +0000
+++ b/bzrlib/vf_repository.py	2011-08-12 08:39:25 +0000
@@ -2503,8 +2503,17 @@
         :param revision_ids: The start point for the search.
         :return: A set of revision ids.
         """
-        import pdb; pdb.set_trace()
+        note('starting')
         t = time.time()
+        from bzrlib import commands
+        ret = commands.apply_lsprofiled(',,profile.txt',
+            self._do_walk_to_common_revisions, revision_ids,
+            if_present_ids=if_present_ids)
+        note('Walking took %.3fs' % (time.time() - t))
+        sys.exit(1)
+        return ret
+
+    def _do_walk_to_common_revisions(self, revision_ids, if_present_ids=None):
         target_graph = self.target.get_graph()
         revision_ids = frozenset(revision_ids)
         if if_present_ids:
@@ -2562,9 +2571,6 @@
                 searcher.stop_searching_any(stop_revs)
             if searcher_exhausted:
                 break
-        note('Walking took %.3fs, with %d steps'
-             % (time.time() - t, search_step))
-        import pdb; pdb.set_trace()
         return searcher.get_result()
 
     @needs_read_lock



More information about the bazaar-commits mailing list