Rev 2653: (John Arbash Meinel) Make get_revision_graph() ask the versioned file, fix a performance bug in VF.get_graph([revision_id]) in file:///home/pqm/archives/thelove/bzr/%2Btrunk/

Canonical.com Patch Queue Manager pqm at pqm.ubuntu.com
Thu Jul 26 14:43:57 BST 2007


At file:///home/pqm/archives/thelove/bzr/%2Btrunk/

------------------------------------------------------------
revno: 2653
revision-id: pqm at pqm.ubuntu.com-20070726134355-tlidmsn3eux09idz
parent: pqm at pqm.ubuntu.com-20070725140043-22lenkarm0oc3tvx
parent: john at arbash-meinel.com-20070725212922-a72okcvs6nahrz5t
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Thu 2007-07-26 14:43:55 +0100
message:
  (John Arbash Meinel) Make get_revision_graph() ask the versioned file, fix a performance bug in VF.get_graph([revision_id])
modified:
  NEWS                           NEWS-20050323055033-4e00b5db738777ff
  bzrlib/repofmt/knitrepo.py     knitrepo.py-20070206081537-pyy4a00xdas0j4pf-1
  bzrlib/versionedfile.py        versionedfile.py-20060222045106-5039c71ee3b65490
    ------------------------------------------------------------
    revno: 2652.1.2
    merged: john at arbash-meinel.com-20070725212922-a72okcvs6nahrz5t
    parent: john at arbash-meinel.com-20070725212630-31m6ichxpr1f8mlk
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: remove_get_revision_graph_redundancy
    timestamp: Wed 2007-07-25 16:29:22 -0500
    message:
      NEWS entry for both fixes
    ------------------------------------------------------------
    revno: 2652.1.1
    merged: john at arbash-meinel.com-20070725212630-31m6ichxpr1f8mlk
    parent: pqm at pqm.ubuntu.com-20070725140043-22lenkarm0oc3tvx
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: remove_get_revision_graph_redundancy
    timestamp: Wed 2007-07-25 16:26:30 -0500
    message:
      Avoid set.difference_update(other) because it is slow when other is big.
      Also, use a_weave.get_graph() rather than re-implementing it in
      knitrepo.py. Further, don't call get_graph() if you aren't going to use it.
=== modified file 'NEWS'
--- a/NEWS	2007-07-25 00:52:21 +0000
+++ b/NEWS	2007-07-25 21:29:22 +0000
@@ -83,6 +83,15 @@
     * Commit now only shows the progress in terms of directories instead of
       entries. (Ian Clatworthy)
 
+    * Fix ``KnitRepository.get_revision_graph`` to not request the graph 2
+      times. This makes ``get_revision_graph`` 2x faster. (John Arbash
+      Meinel)
+
+    * Fix ``VersionedFile.get_graph()`` to avoid using
+      ``set.difference_update(other)``, which has bad scaling when
+      ``other`` is large. This improves ``VF.get_graph([version_id])`` for
+      a 12.5k graph from 2.9s down to 200ms. (John Arbash Meinel)
+
   LIBRARY API BREAKS:
 
     * Deprecated dictionary ``bzrlib.option.SHORT_OPTIONS`` removed.

=== modified file 'bzrlib/repofmt/knitrepo.py'
--- a/bzrlib/repofmt/knitrepo.py	2007-07-25 00:52:21 +0000
+++ b/bzrlib/repofmt/knitrepo.py	2007-07-25 21:26:30 +0000
@@ -146,22 +146,13 @@
             return {}
         revision_id = osutils.safe_revision_id(revision_id)
         a_weave = self._get_revision_vf()
-        entire_graph = a_weave.get_graph()
         if revision_id is None:
             return a_weave.get_graph()
         elif revision_id not in a_weave:
             raise errors.NoSuchRevision(self, revision_id)
         else:
             # add what can be reached from revision_id
-            result = {}
-            pending = set([revision_id])
-            while len(pending) > 0:
-                node = pending.pop()
-                result[node] = tuple(a_weave.get_parents(node))
-                for revision_id in result[node]:
-                    if revision_id not in result:
-                        pending.add(revision_id)
-            return result
+            return a_weave.get_graph([revision_id])
 
     @needs_read_lock
     def get_revision_graph_with_ghosts(self, revision_ids=None):

=== modified file 'bzrlib/versionedfile.py'
--- a/bzrlib/versionedfile.py	2007-07-25 00:52:21 +0000
+++ b/bzrlib/versionedfile.py	2007-07-25 21:26:30 +0000
@@ -399,8 +399,10 @@
             pending = set()
             for version, parents in self.iter_parents(this_iteration):
                 result[version] = parents
-                pending.update(parents)
-            pending.difference_update(result)
+                for parent in parents:
+                    if parent in result:
+                        continue
+                    pending.add(parent)
         return result
 
     def get_graph_with_ghosts(self):




More information about the bazaar-commits mailing list