[patch] hashcache fixes

Martin Pool mbp at canonical.com
Mon Jul 10 01:10:13 BST 2006


On  9 Jul 2006, Robert Collins <robertc at robertcollins.net> wrote:
> On Sun, 2006-07-09 at 17:46 +1000, Martin Pool wrote:
> 
> Generally +1. One thing I'd like discussed before it's committed, and a
> couple of thoughts.
> 
> The thoughts first: perhaps the working tree test_hashcache test should
> go away? With your changes the test is a no-op: it's not checking that
> caching occurs (which is arguably something we want to test to prevent
> performance regressions), nor is it testing for race conditions that a
> working tree might experience. (I think the latter is worth testing as
> an aspect of workingtree - create a tree, a file, get its vital
> statistics, then modify it and check the tree notices.)

Yes, it probably should go away.  If we bring in a dirstate-style file
or some other large change to the working tree we should think about
adding such a test.  I'll delete it, then.

> > +        if not hasattr(os, 'mkfifo'):
> > +            raise TestSkipped('filesystem fifos not supported on this system')
> 
> This seems to me to be a case for just returning - it's not something
> that the person running the test case can correct. (Which is what we
> defined TestSkipped to mean.)

No, I really don't think so.  Silently not running tests is really no
good: what if I misspelled 'mkfifo', so that the test always returned?
When Aaron's refinement of TestSkipped lands, we can raise an error
indicating 'we can't test this on your OS'.

There's a useful semantic difference between "this passed" and "this
couldn't be tested".

We were talking the other day about trying to make sure that there are
zero skipped tests on fully-fledged developer machines, so that any
skip indicates something is wrong.  The flip side is that we should
never silently return rather than skipping.
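
To make the contrast concrete, here's a minimal sketch of the two
patterns (the import of TestSkipped is assumed - put it wherever that
exception actually lives):

    import os

    from bzrlib.tests import TestCaseInTempDir, TestSkipped

    class TestFifo(TestCaseInTempDir):

        def test_fifo_silent_return(self):
            # Bad: if 'mkfifo' were misspelled here, this test would
            # quietly "pass" on every platform, forever.
            if not hasattr(os, 'mkfifo'):
                return
            os.mkfifo('fifo')
            self.failUnless(os.path.exists('fifo'))

        def test_fifo_explicit_skip(self):
            # Good: the runner records "couldn't be tested", which is
            # visibly different from "passed".
            if not hasattr(os, 'mkfifo'):
                raise TestSkipped(
                    'filesystem fifos not supported on this system')
            os.mkfifo('fifo')
            self.failUnless(os.path.exists('fifo'))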

> +    def test_hashcache_hit_old_file(self):
> +        """An old file gives a cache hit"""
> +        return ### this takes too long to run properly; skipped
> 
> This seems to be something to raise TestSkipped on - it is something
> that can be changed to run.

OK, I can imagine skipping that as a slow test or something.
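
Something like this, say, once we can raise rather than silently
return:

    def test_hashcache_hit_old_file(self):
        """An old file gives a cache hit"""
        # raising rather than returning means the runner records that
        # this test wasn't exercised, instead of counting it as a pass
        raise TestSkipped('this takes too long to run properly')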

> That said, here's what I'd like to discuss - some of the tests for the
> hash cache seem fragile, because we're depending on the external system
> to line up with our expectations in order to trigger failures.

I think the new tests will not be fragile, in the sense that they should
not falsely fail if they're run on a very slow or a very fast machine.
(The old ones could.)

> I think it would
> make sense - and as we are working on this right now - to simulate
> the external conditions that can cause grief, and run the hashcache
> tests against such a simulated system, where we can say 'a stat of this
> file will return an mtime with a granularity of 2 seconds' and 'the
> system time is now 1 second later' without needing *either* an FS with
> 2-second granularity *or* to wait for the system clock to actually move
> forward 2 seconds.

Yes, when the real world is too hard to test against, using a simulator
is a good idea.
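
To sketch what that might look like (all names below are hypothetical -
the real hashcache talks to time.time() and os.stat() directly, so
growing seams like these would be part of the work):

    class FakeClock(object):
        """Deterministic stand-in for time.time() in tests."""

        def __init__(self, now=1000000.0):
            self._now = now

        def time(self):
            return self._now

        def advance(self, seconds):
            # seconds may even be negative, to model the apparent
            # backwards step that bit us here
            self._now += seconds

    class CoarseStatFile(object):
        """A file on a filesystem with 2-second mtime granularity."""

        GRANULARITY = 2

        def __init__(self, clock):
            self._clock = clock
            self.touch()

        def touch(self):
            # coarse filesystems round the stored mtime down
            now = int(self._clock.time())
            self.mtime = now - (now % self.GRANULARITY)

A hashcache grown against seams like these could then be exercised
without any real sleeping:

    clock = FakeClock()
    f = CoarseStatFile(clock)
    clock.advance(1)    # 'the system time is now 1 second later'
    # the file's rounded mtime is still within the danger window, so a
    # cache built on these seams must not trust the cached sha1 yet
    assert clock.time() - f.mtime < CoarseStatFile.GRANULARITY + 1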

However, any simulation is only as good as your understanding of the
situation: if the bug is caused by misunderstanding how the system
behaves, the simulation won't catch it.  I would not have imagined the
time could apparently step backwards, as it did here.

Of course once you realize that this can happen you can update the
simulation.

But on the other hand, tracking down the bug by writing a test against
the real world, and then leaving (a cleaned-up version of) that test in
the suite, is good.

For now this bug is closed, and in any case we expect to rework this
code in the future (folding it in with the working inventory).
Simulating the filesystem might be worth doing then.

So for now I propose to land this, treating simulation of the filesystem
as just a todo.

-- 
Martin