Knit record cache considered harmful

Aaron Bentley aaron.bentley at utoronto.ca
Fri Jun 23 05:35:45 BST 2006


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi all,

Many people have noted that building large trees eats gobs and gobs of
memory.  I suspected the problem was knit caching.

Michael Ellerman was experiencing this problem, so I sent him a patch
that cleared the knit caches for each file, during the tree build
process.  Here are his results:

http://michael.ellerman.id.au/files/comparison.png

Not only was much less memory used, but the process finished much more
quickly (~150 seconds).  Michael thinks this might be because allocating
a lot of memory would have caused excessive swapping.

The problem with this cache is that it isn't really a cache.  There is
no cache expiry mechanism.  So things just build up, and build up.

Knit record caches are only an advantage if we want multiple versions of
the file (or inventory, or revision), but only a few operations (e.g.
log, annotate, merge) follow that pattern.  Most just use the knit to
get one version.  And any command that wants more than one version can
ask for them all at once, to get similar efficiency.
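
To illustrate that last point, here is a minimal sketch.  It is not
bzrlib code: ToyKnit, get_texts, get_text and the reads counter are
hypothetical names made up for this example, and the "delta" format
(each version just appends lines to its parent) is a simplification.
The idea is that a batch request can share intermediate work through a
dict that only lives for the duration of the call, so nothing is left
behind to pile up.

```python
class ToyKnit:
    """Hypothetical stand-in for a knit; each version appends lines to its parent."""

    def __init__(self):
        self._records = {}  # version_id -> (parent_id or None, added lines)
        self.reads = 0      # number of record reads, to compare access patterns

    def add_version(self, version_id, parent_id, added_lines):
        self._records[version_id] = (parent_id, list(added_lines))

    def _read_record(self, version_id):
        self.reads += 1
        return self._records[version_id]

    def get_texts(self, version_ids):
        """Build several versions in one call, sharing work via a call-local dict."""
        built = {}  # discarded when the call returns, so nothing accumulates

        def build(version_id):
            if version_id not in built:
                parent_id, added = self._read_record(version_id)
                base = build(parent_id) if parent_id is not None else []
                built[version_id] = base + added
            return built[version_id]

        return {v: build(v) for v in version_ids}

    def get_text(self, version_id):
        """Build a single version; equivalent to a batch of one."""
        return self.get_texts([version_id])[version_id]


knit = ToyKnit()
knit.add_version("v1", None, ["a"])
knit.add_version("v2", "v1", ["b"])
knit.add_version("v3", "v2", ["c"])

# One batched request for two versions.
texts = knit.get_texts(["v2", "v3"])
batched_reads = knit.reads

# The same two versions requested one at a time, with no persistent cache.
knit.reads = 0
separate = {v: knit.get_text(v) for v in ["v2", "v3"]}
separate_reads = knit.reads
```

In this toy, the batched request reads fewer records than the two
separate requests, without keeping any state on the object between
calls.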

I've run our benchmarks, and disabling the caches doesn't have much
effect on most of them.  The biggest difference seems to be in
test_make_kernel_like_tree, and it's not reproducible.

So one option is to disable the cache permanently.  Another is to make
the cache behave like a proper cache, with limits on size.  Another is
to make the cache off by default.
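
As a rough sketch of the second option, something like the following
would do.  This is not a patch against the Knit code:
BoundedRecordCache and its methods are hypothetical names, and I'm
assuming a limit on the number of entries rather than on bytes, with
least-recently-used eviction.

```python
from collections import OrderedDict


class BoundedRecordCache:
    """Hypothetical size-limited cache that evicts the least recently used entry."""

    def __init__(self, max_entries):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._max_entries = max_entries
        self._entries = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._entries:
            return default
        # A hit makes the entry the most recently used.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        # Evict from the old end until we are back under the limit.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)

    def clear(self):
        """Drop everything, e.g. after finishing with one file."""
        self._entries.clear()

    def __len__(self):
        return len(self._entries)


cache = BoundedRecordCache(max_entries=2)
cache.put("rev-a", "text a")
cache.put("rev-b", "text b")
cache.get("rev-a")            # touch rev-a so rev-b becomes the oldest
cache.put("rev-c", "text c")  # over the limit: rev-b is evicted
```

A byte-based limit would probably be closer to what we want for memory
use, but it needs a size estimate per record, so the entry count is
used here to keep the sketch short.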

But I think we need to do something, because right now, the Knit API has
a big gotcha in it.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEm2+h0F+nu1YWqI0RAhmzAJ0SUooMf9LDzaBrPWTSv3yiJdakOwCeKhjl
vcERvk71WNPgf5mu4wv2HvQ=
=69fj
-----END PGP SIGNATURE-----



