[MERGE] Better LRUCache implementation
John Arbash Meinel
john at arbash-meinel.com
Tue Mar 24 14:28:41 GMT 2009
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Ian Clatworthy wrote:
> John Arbash Meinel wrote:
>
>> I realized that my submission still has some "assert" statement, which
>> I'll certainly clean out before I submit.
>
> Please. There's also an "import pdb; ..." in _remove_node which needs to go.
> I'd be tempted to remove the asserts altogether in _record_access as that
> code path ought to be hit often, but I'll let you make the call on that.
Yeah, I did.
>
> bb:tweak
>
...
>
>> - # This approximates that texts are > 0.5k in size. It only really
>> - # effects when we clean up the queue, so we don't want it to be too
>> - # large.
>> self._update_max_size(max_size, after_cleanup_size=after_cleanup_size)
>> LRUCache.__init__(self, max_cache=max(int(max_size/512), 1))
>
> This int(max_size/512) magic appears twice in the code and it's a complete
> mystery to me. I gather that comment was trying to explain it in one of
> those places. For the sake of code clarity, I'd appreciate it if that magic
> was moved into a method that explained the basis for the heuristic.
>
So it actually doesn't matter anymore, because it isn't used. Originally
we had a 'deque()' that tracked the number of accesses, which would
then be 'flushed' periodically. (based approximately on max_size*4). I
think I can actually set it to 0 now, I'll check on that.
>> value_len = self._compute_size(value)
>> if value_len >= self._after_cleanup_size:
>> + if node is not None:
>> + # The new value is too large, we won't be replacing the value,
>> + # just removing the node completely
>> + self._remove_node(node)
>> return
>
> Hmm. I was uncomfortable about this at first but I think your logic here
> is sound. I wonder if it's worthy of a mutter() so we get a clue that the
> cache size needs consideration/tuning?
I went ahead and added it. I also realized that we need to call
cleanup() if it exists at this point. (The use case for cleanup is stuff
like Branch.unlock(), and I think it is best to assume that if you have
called add() the cleanup will always fire.)
>
>> - self.assertEqual([1, 2, 3, 4, 5, 2, 5, 3, 2], list(cache._queue))
>> - self.assertEqual({1:1, 2:3, 3:2, 4:1, 5:2}, cache._refcount)
>> -
>> - # Compacting should save the last position
>> - cache._compact_queue()
>> - self.assertEqual([1, 4, 5, 3, 2], list(cache._queue))
>> - self.assertEqual({1:1, 2:1, 3:1, 4:1, 5:1}, cache._refcount)
>> + self.assertEqual([2, 3, 5, 4, 1], [n.key for n in cache._walk_lru()])
>>
>
> I'm happy with your changes to the tests but test_compact_preserves_last_access_order()
> needs renaming if the test is no longer testing that. :-)
>
> Otherwise, well done on this.
>
> Ian C.
>
Thanks,
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAknI7hkACgkQJdeBCYSNAAP5OACfVtt43jCm8pYACmZT/nnjZ8Rg
72MAnjycVJYjoKsPJGv7IurSWK39brf7
=etgc
-----END PGP SIGNATURE-----
More information about the bazaar
mailing list