Knit record cache considered harmful

Michael Ellerman michael at ellerman.id.au
Fri Jun 23 09:33:18 BST 2006


On 6/23/06, Aaron Bentley <aaron.bentley at utoronto.ca> wrote:
> Many people have noted that building large trees eats gobs and gobs of
> memory.  I suspected the problem was knit caching.
>
> Michael Ellerman was experiencing this problem, so I sent him a patch
> that cleared the knit caches for each file, during the tree build
> process.  Here are his results:
>
> http://michael.ellerman.id.au/files/comparison.png
>
> Not only was much less memory used, but the process finished much more
> quickly (~150 seconds).  Michael thinks this might be because allocating
> a lot of memory would have caused excessive swapping.

Actually, ignore the timings on the graph: I only ran each test once
(because it takes so long and hoses your machine), so they're dicey.

I just ran some more tests to investigate. It looks like the slowdown
is caused purely by the page-cache flushing effect.

Timings for three consecutive branch runs:

Current bzr:
real    6m15.945s
real    7m41.221s
real    8m3.688s

Patched bzr:
real    7m46.353s
real    2m36.174s
real    2m42.719s

So for a cold page cache, the patch makes no difference. But without
the patch, bzr's memory consumption flushes enough out of the page
cache that subsequent runs are no faster. How big this effect is will
depend heavily on how much memory people have, and avoiding it only
helps if they run more than one operation on the same branch.
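
(For what it's worth, anyone wanting to reproduce the cold-cache case
without rebooting can evict a file from the page cache with
posix_fadvise. A sketch in Python -- os.posix_fadvise needs Python
3.3+ and Linux, so this is just for illustration, nothing
bzr-specific:)

    import os

    def evict_from_page_cache(path):
        """Ask the kernel to drop this file's pages from the page
        cache, so the next read of it is genuinely cold (Linux)."""
        fd = os.open(path, os.O_RDONLY)
        try:
            # offset=0, length=0 means "the whole file"
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        finally:
            os.close(fd)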

So it's still nice, but it's not going to be a big win. What _is_ a
big win is the reduction in memory usage for its own sake.
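
For context, the patch boils down to something like the sketch below.
The names are illustrative, not the actual bzrlib API or Aaron's
actual diff; the point is just that each file's record cache is
dropped as soon as that file has been extracted:

    # Sketch only -- hypothetical names, not the real bzrlib API.
    class ToyKnit:
        """Stands in for a knit whose record cache grows without
        bound as versions are read out of it."""
        def __init__(self, texts):
            self._texts = texts    # version_id -> text, on "disk"
            self._cache = {}       # the unbounded record cache

        def get_text(self, version_id):
            if version_id not in self._cache:
                self._cache[version_id] = self._texts[version_id]
            return self._cache[version_id]

        def clear_cache(self):
            self._cache.clear()

    def build_tree(file_knits, revision_id):
        """Extract one version of every file, dropping each knit's
        cache as soon as we're done with it."""
        for file_id, knit in file_knits.items():
            text = knit.get_text(revision_id)
            # ... write text into the working tree here ...
            knit.clear_cache()   # peak memory is per-file, not per-tree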

> The problem with this cache is that it isn't really a cache.  There is
> no cache expiry mechanism.  So things just build up, and build up.
>
> Knit record caches are only an advantage if we want multiple versions of
> the file (or inventory, or revision), but only a few operations (e.g.
> log, annotate, merge) follow that pattern.  Most just use the knit to
> get one version.  And any command that wants more than one version can
> ask for them all at once, to get similar efficiency.
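
If we do decide a cache is worth keeping somewhere, the minimal fix
for the "no expiry" problem is to bound it. The usual LRU pattern
looks something like this (a sketch, not bzrlib code; the size is
made up):

    from collections import OrderedDict

    class LRUCache:
        """Bounded cache: once max_size records accumulate, the
        least recently used one is evicted."""
        def __init__(self, max_size=1000):
            self._max_size = max_size
            self._entries = OrderedDict()

        def __getitem__(self, key):
            value = self._entries.pop(key)   # KeyError if missing
            self._entries[key] = value       # re-insert as most recent
            return value

        def __setitem__(self, key, value):
            if key in self._entries:
                del self._entries[key]
            elif len(self._entries) >= self._max_size:
                self._entries.popitem(last=False)  # evict the oldest
            self._entries[key] = value

With something like that, memory is capped at max_size records no
matter how big the tree is.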

I ran a few quick tests of branch and log over http, and the _only_
place I see us hitting the KnitData cache is when grabbing the
inventory weave.

So I think we should rip it out and perhaps cache the inventory weave
at a higher level, if we can prove it is a performance win.
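
By "a higher level" I mean something like this: one well-known object
held for the life of the repository, rather than every knit hoarding
records (a sketch with hypothetical names, not the real Repository
class):

    class Repository:
        def __init__(self, transport):
            self._transport = transport
            self._inventory_weave = None   # the one thing we cache

        def get_inventory_weave(self):
            """Fetch the inventory weave once, then reuse it for
            the rest of the operation."""
            if self._inventory_weave is None:
                self._inventory_weave = self._load_inventory_weave()
            return self._inventory_weave

        def _load_inventory_weave(self):
            # Stand-in for the real (expensive) fetch, e.g. over http.
            return self._transport.get('inventory.weave')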

cheers



