[rfc] add a TestFactory class or concept

Martin Pool mbp at canonical.com
Mon Aug 10 06:28:23 BST 2009

I was originally thinking about this in the context of Haskell's
QuickCheck, which makes a very strong point of excluding the sample
data from the test property definition.  I'm not at all sure that
would be useful in our case or in general, but it got me thinking.

2009/8/7 Aaron Bentley <aaron at aaronbentley.com>:
> Martin Pool wrote:
>> Lots of tests are run on some specific values; for instance they make
>> a branch and then do some operations on it.
> We need specific values in order to make assertions about them, and to
> provide legibility.  Could you please explain what about this is bad?

I will try.

It's not quite true that you need specific values in order to make
assertions about them: a very common case is that we commit a revision
and then check that its id is retrieved later, without ever hardcoding
the revision id.  Another case is that all tests run in a temporary
directory and they commonly make assertions about the contents of the
filesystem even though the precise path they will have is determined
at run time.

You can make assertions that some output is related to some input,
whether that input is provided by the test environment or returned
from an earlier call into the code under test.
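
As a rough sketch of that kind of assertion (TinyStore is a made-up
stand-in for a repository, not a bzrlib class, and the names are only
for illustration), a test can check that what comes out matches what
went in without ever spelling out the revision id or the directory:

```python
import os
import tempfile
import unittest


class TinyStore:
    """Hypothetical stand-in for a repository; not a bzrlib class."""

    def __init__(self, root):
        self.root = root
        self._count = 0

    def commit(self, text):
        # The id is generated at run time, so the test cannot know it
        # in advance.
        self._count += 1
        rev_id = "rev-%d-%s" % (self._count, os.path.basename(self.root))
        with open(os.path.join(self.root, rev_id), "w") as f:
            f.write(text)
        return rev_id

    def get(self, rev_id):
        with open(os.path.join(self.root, rev_id)) as f:
            return f.read()


class TestRoundTrip(unittest.TestCase):

    def test_commit_then_get(self):
        # The temporary directory's path is also only known at run time.
        with tempfile.TemporaryDirectory() as root:
            store = TinyStore(root)
            rev_id = store.commit("hello")
            # Assert on the relationship between output and input,
            # not on a literal id or path.
            self.assertEqual("hello", store.get(rev_id))
            self.assertTrue(os.path.exists(os.path.join(root, rev_id)))
```

Neither the id nor the path appears as a literal, yet the test still
says something definite about the behaviour.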

I certainly acknowledge that the desired result is often easier to
give for one specific example than as a general form.  For instance
(and see the earlier thread about easier testing) for strings shown to
the user it's easier to feel confident they're correct if you have an
example of the literal string being shown, than a format string that
may or may not make sense.  And sometimes writing the general property
would be nearly as hard or complex as writing the code-under-test,
whereas hand-checking it for one case is easy.

Problems with hardcoding particular values include: the code may be
wrong for other values; the test may be longer than it needs to be;
it's hard to update the test to cover other values or to follow other
changes; and it tends to be repetitive/redundant.

One case where it's harder to update: the info tests repeat the
representation of bzr's default repository format many times, but what
they actually want to assert in most cases is that the output contains
an appropriate format name.
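
Something like the following shows the difference I have in mind.
This is only a sketch: describe_repository, DEFAULT_FORMAT_NAME and the
"format-A" value are invented for the example and are not what the
info code or its tests actually look like.

```python
# Hypothetical names throughout; nothing here is bzrlib API.
DEFAULT_FORMAT_NAME = "format-A"


def describe_repository(format_name=DEFAULT_FORMAT_NAME):
    """Return info-style text for a repository of the given format."""
    return "Format:\n  repository: %s\n" % format_name


def check_brittle():
    # Repeats the whole rendering, so it must be edited every time
    # the default format changes.
    assert describe_repository() == "Format:\n  repository: format-A\n"


def check_robust():
    # States what the test cares about: the output names the format
    # that is actually in use.
    assert DEFAULT_FORMAT_NAME in describe_repository()
```

If the default changes, only check_brittle needs touching, and in the
real tests there are many copies of it.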

I'm suggesting that we don't always strike the right balance:
sometimes we hardcode values where a general assertion would serve
better, and sometimes the reverse.

>> I was talking to jml the other day and he said that Launchpad have a
>> cleaner separation of all of this setup into a TestFactory class
> Cleaner or not, I find bzr's approach much easier to use.

.. so I'd like to work out what's wrong with what Launchpad does too,
so as not to repeat it, if there's anything else you can think of.

>> (iirc) - if you want a Person to test, you would always get that from
>> the factory rather than putting arbitrary example data into the class.
> I don't understand.  There are plenty of examples where we create Person
> objects with specific email addresses and names.  Look at this method
> signature:
>    def makePersonNoCommit(
>        self, email=None, name=None, password=None,
>        email_address_status=None, hide_email_addresses=False,
>        displayname=None, time_zone=None, latitude=None, longitude=None):
> These are all the things we've felt a need to specify in one test case
> or another.
> It's true that we don't have to specify anything at all to makePerson.
> If you were arguing that we should improve our existing helpers, I could
> understand that, but I don't see how it relates to a TestFactory concept.

So, what is the difference between our make* methods and a 'TestFactory
concept'?  I'm not sure.  To me they're similar things; I used the
name to suggest giving it more prominence or standardization.

Generally speaking we add make_* methods on various test classes,
rather than centralizing them.  Many of them are things that could be
more broadly reused.
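
To make the idea concrete, a centralized version might look roughly
like this.  It is a sketch under my own assumptions, not Launchpad's
factory or anything in bzrlib: Person, TestFactory, unique_string and
make_person are all invented here.

```python
import itertools


class Person:
    """Hypothetical domain object used only for this example."""

    def __init__(self, name, email):
        self.name = name
        self.email = email


class TestFactory:
    """Hypothetical central home for make_* helpers."""

    def __init__(self):
        self._serial = itertools.count(1)

    def unique_string(self, prefix="thing"):
        # Each call yields a fresh value, so objects never collide.
        return "%s-%d" % (prefix, next(self._serial))

    def make_person(self, name=None, email=None):
        # Callers pass only what the test cares about; anything left
        # as None is filled in with an arbitrary unique value.
        if name is None:
            name = self.unique_string("person")
        if email is None:
            email = "%s@example.com" % name
        return Person(name, email)
```

A test that doesn't care who the person is calls
factory.make_person(); one that does care about the address calls
factory.make_person(email="someone@example.com") and leaves the rest
alone.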

>> There are some tools such as BranchBuilder to do this.  There are also
>> some utilities on TestCase and similar to provide dependencies, and to
>> integrate with test parameterization - for example
>> TestCase.get_transport() and the implicitly provided working
>> directory.  On the other hand a lot of tests hardcode some test data.
> I find that concrete examples are easier to understand when you get test
> failures.  For example, if you get an exception containing a branch
> path, and the branch path includes the test case variable name, it's
> much easier to see what's going on.

That is true.

> I've never used BranchBuilder, for
> example, and hardly ever used MemoryTree.

I think this is an interesting loose thread to pull on - exactly
the kind of thing I was thinking about.  We have classes that
are meant to make tests faster to run and easier to write, but they're
not used consistently and you hardly use them at all.  So there's
clearly something wrong here.  Possibly we should make those classes
better, possibly we should give up on the whole idea.

Martin <http://launchpad.net/~mbp/>
