Rev 5375: Do some timing with different objects. in http://bazaar.launchpad.net/~jameinel/bzr/2.3-btree-chk-leaf
John Arbash Meinel
john at arbash-meinel.com
Wed Aug 4 03:00:54 BST 2010
At http://bazaar.launchpad.net/~jameinel/bzr/2.3-btree-chk-leaf
------------------------------------------------------------
revno: 5375
revision-id: john at arbash-meinel.com-20100804020038-teb38durhgj6psxj
parent: john at arbash-meinel.com-20100804015214-5dbdub9p0qyc07dp
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: 2.3-btree-chk-leaf
timestamp: Tue 2010-08-03 21:00:38 -0500
message:
Do some timing with different objects.
It turns out that pure dict is 13.7us lookup time, dict-as-attr is 20.7us,
PyObject __getitem__ to dict-as-attr is 54us.
-------------- next part --------------
=== modified file 'bzrlib/_btree_serializer_pyx.pyx'
--- a/bzrlib/_btree_serializer_pyx.pyx 2010-08-04 01:52:14 +0000
+++ b/bzrlib/_btree_serializer_pyx.pyx 2010-08-04 02:00:38 +0000
@@ -567,6 +567,15 @@
hi = self.num_entries
# Bisect until we find the right spot
+ # Note the time to create Python objects dominates over comparison
+ # times. With a simple iterate and compare (no bisect) we spend 45us
+ # for 120 __contains__, but 201us for 120 __getitem__.
+ # The _LeafNode.__getitem__ takes 54.6us, but (k in o._keys) is 20.7us
+ # removing the getattr() w/ (k in _keys) 13.7ms. So we do still have
+ # some room for improvement
+ # TODO: Consider improving this, but for now it seems good enough. (We
+ # could create a small index so we would have less to bisect,
+ # etc.)
while lo < hi:
mid = (lo + hi) / 2
the_cmp = memcmp(sha1, self.entries[mid].sha1, 20)
More information about the bazaar-commits
mailing list