Rev 3827: Do a first-pass over the data during _check_remap. in http://bzr.arbash-meinel.com/branches/bzr/brisbane/hack

John Arbash Meinel john at arbash-meinel.com
Wed Dec 24 19:28:24 GMT 2008


At http://bzr.arbash-meinel.com/branches/bzr/brisbane/hack

------------------------------------------------------------
revno: 3827
revision-id: john at arbash-meinel.com-20081224192805-lpi28zkruumu4o3y
parent: john at arbash-meinel.com-20081224192457-80lac61kjh2lin23
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: hack
timestamp: Wed 2008-12-24 13:28:05 -0600
message:
  Do a first-pass over the data during _check_remap.
  
  This is meant to help when we obviously would not be able to collapse, without
  actually having to map any nodes into a new leaf node.
  Actual effect seems small, even though lsprof shows it is cutting out 1/3rd of
  the map_no_split calls.
-------------- next part --------------
=== modified file 'bzrlib/chk_map.py'
--- a/bzrlib/chk_map.py	2008-12-24 19:24:57 +0000
+++ b/bzrlib/chk_map.py	2008-12-24 19:28:05 +0000
@@ -1188,6 +1188,26 @@
         #       and cause size changes greater than the length of one key.
         #       So for now, we just add everything to a new Leaf until it
         #       splits, as we know that will give the right answer
+        # Do one quick pass to see if it is even remotely possible for things
+        # to collapse
+        # This is a very wide band. We assume that if we have enough leaf nodes
+        # that their total length > 3 times, then we won't be able to fit that
+        # onto a single leaf node, even with prefix compression.
+        threshold_size = self._maximum_size * 2
+        total_size = 0
+        for prefix, node in self._items.iteritems():
+            if type(node) == tuple:
+                continue
+            elif isinstance(node, InternalNode):
+                # Without looking at any leaf nodes, we are sure
+                def child_is_internal_node(): pass
+                child_is_internal_node()
+                return self
+            total_size += node._current_size()
+            if total_size > threshold_size:
+                def total_size_greater_than_threshold(): pass
+                total_size_greater_than_threshold()
+                return self
         new_leaf = LeafNode()
         new_leaf.set_maximum_size(self._maximum_size)
         new_leaf._key_width = self._key_width
@@ -1198,11 +1218,6 @@
             if type(node) == tuple:
                 keys[node] = prefix
             else:
-                if isinstance(node, InternalNode):
-                    # Without looking at any leaf nodes, we are sure
-                    def child_is_internal_node(): pass
-                    child_is_internal_node()
-                    return self
                 for key, value in node._items.iteritems():
                     if new_leaf._map_no_split(key, value):
                         # Adding this key would cause a split, so we know we



More information about the bazaar-commits mailing list