[MERGE] Speedup selftest by loading a previously saved test list
v.ladeuil+lp at free.fr
Thu Jan 17 16:59:51 GMT 2008
>>>>> "robert" == Robert Collins <robertc at robertcollins.net> writes:
robert> On Wed, 2008-01-16 at 15:55 +0100, Vincent Ladeuil wrote:
>> >>>>> "robert" == Robert Collins <robertc at robertcollins.net> writes:
robert> On Fri, 2008-01-11 at 00:16 +0100, Vincent Ladeuil wrote:
>> >> This patch add two options to the selftest command:
>> >> --save-list <file>
>> >> --load-list <file>
>> >> You use the first one to define a set of tests you want to run
>> >> repeatedly and use the second one to avoid loading the whole test
>> >> suite:
>> >> - test inheriting from a class defined inside a function (or
>> >> using a lambda as an attribute) can't be pickled either (but
>> >> see the patch for some obvious fixes in the bzr core tests),
>> >> But overall I think more than 99% of tests can be used with this
>> >> mechanism.
robert> I feel pretty uncomfortable about this.
>> What I've tested so far: 99.76% of our core test suite are
>> serialized reliably (10530 out of 10561 tests) see below.
robert> You have to assume a great deal to be sure that tests
robert> will execute the correct test when you are pickling
robert> their state.
>> Serialized before setUp is run. I assume that each test is
>> initialized by setUp and setUp only regarding its state.
robert> Depending on what you mean by initialized this can be
robert> very false
AIUI, you and Andrew consider that pickle is not the right way to
address the problem.
You know what ? I agree :)
I chose pickle because it allowed *a* solution to the problem. I
don't consider that approach to be perfect. I never have. I had
even some big doubts about it at first, but I gave it a try.
It first failed. But with a few tweaks it worked for nearly the
whole test suite and only some tests cannot be handled but with a
decent way to detect them.
And all the tweaks raised interesting points (the lambda were
useless, the DocTest showed that pickle was a dead end for them,
the BzrDirFormat (well, see below), etc).
robert> in some tests using the pyunit test protocol, which
robert> we try to adhere to for interoperability with other
robert> python libraries (so that bzrlib tests are friendly
robert> to other runners (when bzrlib is embedded), and vice
robert> verca (so that plugins that use features from other
robert> projects can mix in their test classes safely)).
Alright, alright, that still doesn't show me a failing test, but
I can easily imagine that there are some outside of the bzr test
<BzrDirFormat rough (rude ?) patch snipped/>
I'd rather not focus on that since I take inspiration from
RemoteBzrDirFormat and BzrDirMetaFormat1 to define it. The point
was to make the tests pass instead of showing a selftest
execution ending with a big 'FAILED' :) There are other ways to
address the problem, this one was the simplest (but I don't
pretend it is correct).
So I let someone more knowledgeable than me in that area
understand why some formats define an __eq__ while some others
For those interested, the failing test is
This test is parametrized with (among others) a bzrdir_format
*object* into 5 instances 2 of them failing when reloaded
(roughly because we use a registry which creates objects at
import time, said objects when deserialized becoming different
- BzrDirFormat4 skips the test by returning early (TestNotApplicable ?),
- BzrDirFormat5: fails
- BzrDirFormat6: fails
- BzrDirMetaFormat1: succeeds because it defines __eq__
- RemoteBzrDirFormat: succeeds because it defines __eq__
robert> Its undeniably faster. But there is a raft of
robert> complications not address at all by your patch, and
robert> which addressing will add complexity that is at best
robert> confusing to the test environment.
Ok. Let's build on what we agree upon then. I tried to isolate
pickle related problems at save time, effective but far from
robert> I'll change my vote on a discussion about
robert> /correctness and simplicity for developers/ here, not
robert> about speed.
I get that. The main point of that patch was to evaluate if the
speed gains were worth it. If we agree on that, let's find
another way to address the parameter handling (parameter-less
tests should be easy enough to create from name only).
But let's keep in mind that a saved test list can become obsolete
pretty quickly and is not aimed at replacing using selftest the
"normal" way. So I don't want to spend too much effort to be 100%
correct and robust.
robert> pickling and state skew is an insidious problem when
robert> you are dealing with carefully managed data, let
robert> alone tests.
Full agreement here too, it just happened that the tests were
easier to pickle than I expected so I wanted to share the fun ;-)
Let's go back to the problem.
TestCase doc strings says:
| If it is necessary to override the __init__ method, the
| base class __init__ method must always be called. It is
| important that subclasses should not change the signature
| of their __init__ method, since instances of the classes
| are instantiated automatically by parts of the framework in
| order to be run.
That means we can't use __init__ to provide parameters.
| Test authors should subclass TestCase for their own
| tests. Construction and deconstruction of the test's
| environment ('fixture') can be implemented by overriding
| the 'setUp' and 'tearDown' methods respectively.
For test parametrization, what we do is inject the parameters as
TestCase attributes after __init__ but before setUp.
I respected that when reloading the tests and I think the
approach is sound. I'd felt uncomfortable modifying setUp with
I also feel uncomfortable in adding parameters to setUp itself.
I think the aim is to define a specific loading so we should keep
loading and running the test suite separate. setUp is part of
running the test suite so it should not be modified.
>>>>> "Andrew" == Andrew Bennetts <andrew at canonical.com> writes:
Andrew> If there is a benefit to persisting the results of
Andrew> test discovery, I think it'd be better to be
Andrew> persisting the names rather than the instances.
Sure. Let's find a better than pickle way to do that then.
Andrew> If we can get to the situation where given a test
Andrew> name, we can reconstruct it just as cheaply as
Andrew> constructing the TestCase class directly, that would
Andrew> be ideal. And at least as fast as unpickling, if not
Andrew> faster, and definitely much more robust.
No problem as long as no parameter are involved.
Andrew> Perhaps what we should do is change
Andrew> bzrlib.tests.TestCase's id to be easily parseable
Andrew> into four parts: test module name, test case name,
Andrew> test method name, parameterisation pieces.
Yup. 'id' is a good starting point.
return "%s.%s" % (_strclass(self.__class__), self._testMethodName)
return "%s.%s" % (cls.__module__, cls.__name__)
Andrew> I suppose we could already guess this from
Andrew> pretty reliably if we wanted. We could then write a
Andrew> loader that takes a list of such names and constructs
Andrew> just those tests, without loading unnecessary modules
Andrew> and without constructing many unnecessary instances.
Yes a simple:
mod, klass, meth = smartsplit(name)
if not loaded(mod):
Andrew> (I'd expect the parameterisation + module-level
Andrew> test_suite methods to be the biggest problem here,
I avoided that, and I think we still can avoid it by ignoring the
test suite 'shape', simplifying it as a test list and handle the
parameters for each test. At least nothing in my experiments went
against that choice.
Andrew> but I think for a first cut simply building the
Andrew> entire test_suite then filtering would be ok.
You mean the module test suite here I assume.
Andrew> It only takes 500ms on my laptop to construct the
Andrew> entire repository_implementations test_suite(), which
Andrew> has 2090 TestCases.)
I have some doubts about that approach because I suspect it will
be more intrusive that adding capabilities (or visibility on
parameters) directly to the TestCase classes (or the associated
The problem is to find the right balance between keeping the
parametrization sufficiently opened while still being able to
build the TestCase from the parameters. pickle allowed me to
postpone these choices while implementing the rest.
I think a promising approach is to save the test name, its
parameters and the associated adapter class name.
Then the adapter should be modified to always create the tests
from the parameter textual representations (as used in the
The impact on test writers should be kept minimal that way and
give a better tracking on how one should parametrize tests and
how to find back how a failing test have been parametrized.
The needed data furiously looks like those available in the
self.scenarios attribute of the tests.TestScenarioApplier object
with the additional constraint that the first element of the
tuple should be unique, but I suspect it's already the case.
I will give that a try.
More information about the bazaar