Rev 3979: Change the Knit.get_record_stream() to batch by 100 file texts at a time. in lp:///~jameinel/bzr/knit_batching

John Arbash Meinel john at arbash-meinel.com
Tue Feb 3 21:52:20 GMT 2009


At lp:///~jameinel/bzr/knit_batching

------------------------------------------------------------
revno: 3979
revision-id: john at arbash-meinel.com-20090203215215-xr7neeokak62rd7j
parent: pqm at pqm.ubuntu.com-20090202091414-4q20mjzsvp03vyfc
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: knit_batching
timestamp: Tue 2009-02-03 15:52:15 -0600
message:
  Change the Knit.get_record_stream() to batch by 100 file texts at a time.
-------------- next part --------------
=== modified file 'bzrlib/knit.py'
--- a/bzrlib/knit.py	2009-01-29 23:01:03 +0000
+++ b/bzrlib/knit.py	2009-02-03 21:52:15 +0000
@@ -1285,9 +1285,21 @@
             non_local_keys = needed_from_fallback - absent_keys
             prefix_split_keys = self._split_by_prefix(present_keys)
             prefix_split_non_local_keys = self._split_by_prefix(non_local_keys)
+            batched = []
+            cur_keys = set()
+            cur_non_local = set()
             for prefix, keys in prefix_split_keys.iteritems():
                 non_local = prefix_split_non_local_keys.get(prefix, [])
-                non_local = set(non_local)
+                cur_keys.update(keys)
+                cur_non_local.update(non_local)
+                # ??? Are the values in cur_keys always a superset of
+                # cur_non_local?
+                if len(cur_keys) + len(cur_non_local) > 100:
+                    batched.append((cur_keys, cur_non_local))
+                    cur_keys = set()
+                    cur_non_local = set()
+            batched.append((cur_keys, cur_non_local))
+            for keys, non_local in batched:
                 text_map, _ = self._get_content_maps(keys, non_local)
                 for key in keys:
                     lines = text_map.pop(key)
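
For readers who want to try the batching logic outside of bzrlib, here is a minimal standalone sketch of the loop added above. It is an illustration under assumptions, not the bzrlib code: the names batch_by_prefix and batch_size are hypothetical, it uses Python 3's dict.items() rather than iteritems(), and unlike the patch it skips the trailing batch when that batch is empty. As in the patch, a prefix is never split, so a batch can exceed the threshold.

```python
def batch_by_prefix(prefix_split_keys, prefix_split_non_local_keys,
                    batch_size=100):
    """Group per-prefix key collections into batches.

    Hypothetical helper mirroring the loop in the patch: keys are
    accumulated prefix by prefix, and a batch is flushed once the combined
    count of keys and non-local keys exceeds batch_size.
    """
    batched = []
    cur_keys = set()
    cur_non_local = set()
    for prefix, keys in prefix_split_keys.items():
        non_local = prefix_split_non_local_keys.get(prefix, [])
        cur_keys.update(keys)
        cur_non_local.update(non_local)
        # Flush only after adding a whole prefix, so one prefix always
        # stays within a single batch.
        if len(cur_keys) + len(cur_non_local) > batch_size:
            batched.append((cur_keys, cur_non_local))
            cur_keys = set()
            cur_non_local = set()
    # Assumption: drop the final batch if nothing is left over. The patch
    # itself appends it unconditionally.
    if cur_keys or cur_non_local:
        batched.append((cur_keys, cur_non_local))
    return batched


# Three file prefixes with 60 (file_id, revision_id) keys each.
split_keys = {
    ("file-%d" % i,): [("file-%d" % i, "rev-%d" % j) for j in range(60)]
    for i in range(3)
}
batches = batch_by_prefix(split_keys, {})
print([len(keys) for keys, _ in batches])  # prints [120, 60]
```

With 60 keys per prefix, the first two prefixes together pass the threshold and are flushed as one batch of 120; the third prefix ends up in the trailing batch.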


