Direction on faster iteritems
John Arbash Meinel
john at arbash-meinel.com
Thu Mar 19 04:38:03 GMT 2009
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Ian Clatworthy wrote:
> Hi John,
>
> I've been working on making iteritems faster following
> your email advice the other day. I'm not sure I'm on
> the right track though, e.g. perhaps I'm not returning
> the right information out of iternodes?
>
> I started pushing my branch to lp 6-7 hours ago and it's
> *still* going so I'm sending you this patch to glance over
> when you get a chance. It's very much a Work In Progress
> but there's enough there to see how I've interpreted your
> email. :-)
>
> Can you see anything I'm doing that's obviously wrong?
>
> Ian C.
>
def __repr__(self):
- - items_str = sorted(self._items)
+ items_str = str(sorted(self._items))
if len(items_str) > 20:
- - items_str = items_str[16] + '...]'
+ items_str = items_str[16:] + '...]'
return \
^- I'm sure you want [:16] rather than [16:] :)
- - def iteritems(self, store, key_filter=None):
+ def iteritems(self, store, key_filter=None, matched_key=None):
^- I'm not 100% sure what you are thinking to do with "matched_key". My
idea was to prune key_filter before calling LeafNode.iteritems() rather
than doing it as an extra parameter.
...
+ matched = {}
+ for node in self._iter_nodes(store, key_filter=key_filter,
+ matched_keys=matched):
^- My idea here was to change the return value for "_iter_nodes" so that
it would return:
for node, node_key_filter in self._iter_nodes(...):
for item in node.iteritems(key_filter=node_key_filter):
yield item
And then change the _iter_nodes matching code, so that it maps from the
'length_filter' which matched back to an original key. So you would have:
prefix_to_keys = {}
for key in key_filter:
search_key = self._search_prefix_filter(key)
try:
length_filters[len(search_key)].add(length_filter)
except KeyError:
length_filters[len(search_key)] = set([search_key])
try:
prefix_to_keys[search_key].append(key)
except KeyError:
prefix_to_keys[search_key] = [key]
...
node_key_filters = []
for length, length_filter in length_filters:
prefix_start = prefix[:length]
if prefix_start in length_filter:
node_key_filters.extend(prefix_to_keys[prefix_start])
if node_key_filters:
#queue up items that aren't nodes yet
yield node, node_key_filters
And the other major change for LeafNode.iteritems()
narrow_key_filters = []
for key in key_filters:
if len(key) == self._node_width:
try:
yield self._items[key]
except KeyError:
# that key doesn't exist here
pass
else:
narrow_key_filters.append(key)
if narrow_key_filters:
# This is stuff like (parent_id,) for a (parent_id, basename) map
# Create another set of 'length' based filters, but this is based on
# the key width, and not the length of the serialized form.
...
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAknBzCsACgkQJdeBCYSNAANYeQCfYBzAOQUa+enfhl75ffVXM9Y8
//kAn068o6Hcj7nVHuBSvCTg47RbRmJ8
=ivpu
-----END PGP SIGNATURE-----
More information about the bazaar
mailing list