RFC: dirstate-key-cache?

John Arbash Meinel john at arbash-meinel.com
Tue Sep 25 21:50:06 BST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> I wonder if having a dict of relpath:dirstate-key, would be a win?
> 
> (if this is too brief to make sense of, you're probably not familiar
> enough with the code in question to provide the sort of answer I'm
> looking for :))
> 
> -Rob

Generally... no. I tried using one for bisect, and it turned out to be
pretty much a wash. Try using:

bzr selftest --benchmark dirstate

There should be test_bisect_dirblock_cached_py and test_bisect_dirblock_py.

Basically it saves a cache of the split form. And at least on my
systems, it is *slower* by about 8%.

test_bisect_dirblock_c          49ms/    4684ms
test_bisect_dirblock_cached_py 880ms/    5951ms
test_bisect_dirblock_py        837ms/    5041ms

Now it might be better if we cached it all up front, and then assumed we
would never get a cache miss (no try/except, etc).

You can test that by changing the function slightly to fill the cache
before it calls the bisect function. However, for most of the benchmark
the cache should be fairly full (it caches for every key it has to
compare, and it compares every key). It may actually be more optimal
than caching everything up front, since the dictionary should only
contain the "power of 2" keys, which should make it smaller. (Sure it is
a O(1) hash dictionary, but I don't think that makes it *strictly* size
independent. Memory misses at least cause some O(N/log(N)) behavior.)

I'm not sure what you need to use them for, either. Since we have the
optimized cmp_by_dirs and cmp_path_by_dirblock functions. Which can do
the comparison without having to .split() at all. (Which is one of the
reasons why the C bisect is 17x faster.)

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG+XR+JdeBCYSNAAMRAo+zAKDQB728IeSB8o4OcB15NfSX60FinQCeIK0c
IOa5ZriYPjdoTe4779u/Rto=
=5elJ
-----END PGP SIGNATURE-----



More information about the bazaar mailing list