Rev 3827: Do a first-pass over the data during _check_remap. in http://bzr.arbash-meinel.com/branches/bzr/brisbane/hack
John Arbash Meinel
john at arbash-meinel.com
Wed Dec 24 19:28:24 GMT 2008
At http://bzr.arbash-meinel.com/branches/bzr/brisbane/hack
------------------------------------------------------------
revno: 3827
revision-id: john at arbash-meinel.com-20081224192805-lpi28zkruumu4o3y
parent: john at arbash-meinel.com-20081224192457-80lac61kjh2lin23
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: hack
timestamp: Wed 2008-12-24 13:28:05 -0600
message:
Do a first-pass over the data during _check_remap.
This is meant to help when we obviously would not be able to collapse, without
actually having to map any nodes into a new leaf node.
Actual effect seems small, even though lsprof shows it is cutting out 1/3rd of
the map_no_split calls.
-------------- next part --------------
=== modified file 'bzrlib/chk_map.py'
--- a/bzrlib/chk_map.py 2008-12-24 19:24:57 +0000
+++ b/bzrlib/chk_map.py 2008-12-24 19:28:05 +0000
@@ -1188,6 +1188,26 @@
# and cause size changes greater than the length of one key.
# So for now, we just add everything to a new Leaf until it
# splits, as we know that will give the right answer
+ # Do one quick pass to see if it is even remotely possible for things
+ # to collapse
+ # This is a very wide band. We assume that if we have enough leaf nodes
+ # that their total length > 3 times, then we won't be able to fit that
+ # onto a single leaf node, even with prefix compression.
+ threshold_size = self._maximum_size * 2
+ total_size = 0
+ for prefix, node in self._items.iteritems():
+ if type(node) == tuple:
+ continue
+ elif isinstance(node, InternalNode):
+ # Without looking at any leaf nodes, we are sure
+ def child_is_internal_node(): pass
+ child_is_internal_node()
+ return self
+ total_size += node._current_size()
+ if total_size > threshold_size:
+ def total_size_greater_than_threshold(): pass
+ total_size_greater_than_threshold()
+ return self
new_leaf = LeafNode()
new_leaf.set_maximum_size(self._maximum_size)
new_leaf._key_width = self._key_width
@@ -1198,11 +1218,6 @@
if type(node) == tuple:
keys[node] = prefix
else:
- if isinstance(node, InternalNode):
- # Without looking at any leaf nodes, we are sure
- def child_is_internal_node(): pass
- child_is_internal_node()
- return self
for key, value in node._items.iteritems():
if new_leaf._map_no_split(key, value):
# Adding this key would cause a split, so we know we
More information about the bazaar-commits
mailing list