RFC: dirstate-key-cache?

Robert Collins robertc at robertcollins.net
Tue Sep 25 22:08:10 BST 2007


On Tue, 2007-09-25 at 15:50 -0500, John Arbash Meinel wrote:


> Basically it saves a cache of the split form. And at least on my
> systems, it is *slower* by about 8%.
> 
> test_bisect_dirblock_c          49ms/    4684ms
> test_bisect_dirblock_cached_py 880ms/    5951ms
> test_bisect_dirblock_py        837ms/    5041ms
> 
> Now it might be better if we cached it all up front, and then assumed we
> would never get a cache miss (no try/except, etc).
> 
> You can test that by changing the function slightly to fill the cache
> before it calls the bisect function. However, for most of the benchmark
> the cache should be fairly full (it caches for every key it has to
> compare, and it compares every key). It may actually be more optimal
> than caching everything up front, since the dictionary should only
> contain the "power of 2" keys, which should make it smaller. (Sure, it is
> an O(1) hash dictionary, but I don't think that makes it *strictly* size
> independent. Memory misses at least cause some O(N/log(N)) behavior.)
> 
> I'm not sure what you need to use them for, either, since we have the
> optimized cmp_by_dirs and cmp_path_by_dirblock functions, which can do
> the comparison without having to .split() at all. (That is one of the
> reasons why the C bisect is 17x faster.)
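
[For readers following along: a minimal sketch of the kind of split-key
cache being discussed, not the actual patch. The names
bisect_dirblock_cached and prefill_cache are hypothetical, and dirblocks
is assumed to be a list of (dirname, entries) tuples sorted by the
'/'-split form of dirname. The first function fills the cache lazily with
try/except; the second builds it up front, which is the variant John
suggests trying.]

```python
def bisect_dirblock_cached(dirblocks, dirname, lo=0, hi=None, cache=None):
    """Return the index where dirname would be inserted into dirblocks.

    Hypothetical sketch: directory names are compared by their
    '/'-split form, and each split result is remembered in `cache`.
    """
    if cache is None:
        cache = {}
    if hi is None:
        hi = len(dirblocks)
    # Split the search key once, reusing a cached result if present.
    try:
        dirname_split = cache[dirname]
    except KeyError:
        dirname_split = cache[dirname] = dirname.split('/')
    while lo < hi:
        mid = (lo + hi) // 2
        cur = dirblocks[mid][0]
        # Lazy fill: only the keys the bisection actually visits get cached.
        try:
            cur_split = cache[cur]
        except KeyError:
            cur_split = cache[cur] = cur.split('/')
        if cur_split < dirname_split:
            lo = mid + 1
        else:
            hi = mid
    return lo


def prefill_cache(dirblocks):
    """Build the whole cache up front, so lookups on known keys never miss."""
    return {block[0]: block[0].split('/') for block in dirblocks}


# Example data, sorted by split form: ['a', 'b'] sorts before ['a-b'].
dirblocks = [('', []), ('a', []), ('a/b', []), ('a-b', []), ('b', [])]
shared_cache = prefill_cache(dirblocks)
index = bisect_dirblock_cached(dirblocks, 'a-b', cache=shared_cache)
```

[Passing one shared dict across calls is what makes the cache pay off, if
it pays off at all; the numbers above suggest the dict lookups can cost
more than the .split() calls they replace.]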

So, I lost 4 seconds of performance adding the stat cache back in, and
about 4 of those seconds are in _get_entry. That is why I was wondering
about a cache; the C extensions didn't help at all with the performance
there.

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.