[rfc] add a TestFactory class or concept

Thu Aug 13 23:40:36 BST 2009

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin Pool wrote:
>> Martin Pool wrote:
>>> Lots of tests are run on some specific values; for instance they make
>>> a branch and then do some operations on it.
>> We need specific values in order to make assertions about them, and to
>> provide legibility.  Could you please explain what about this is bad?
> 
> I will try.
> 
> It's not quite true that you need specific values in order to make
> assertions about them: a very common case is that we commit a revision
> and then check that its id is retrieved later, without ever hardcoding
> the revision id.

I think this is a good example, because it gets at a lot of the issues.

I don't think we're making assertions about specific values in that
case.  In an example where we were making assertions about specific
values, we would specify a revision-id, and then later test that the
literal revision-id had been used.

So if you'd like to argue instead that we rarely need to make assertions
about specific values, I'd certainly agree it's not a need.

We can test almost everything by allowing the the revision-id to be
auto-generated and storing it in a variable.  (We can't test that
specifying a revision-id works, of course.)  But I find that the
indirection through that variable name makes the test more abstract,
less concrete, and harder to follow.

> Another case is that all tests run in a temporary
> directory and they commonly make assertions about the contents of the
> filesystem even though the precise path they will have is determined
> at run time.

This is some kind of middle ground between specified and unspecified.
Semi-specified?  We create a scenario with values that are specified
relative to the cwd.  Then we test the outcome with the values we had
specified earlier, relative to the cwd.  I agree that we do not specify
full paths, but the tests would fail if our specified relative paths
were ignored.

> Problems with hardcoding particular values include: the code may be
> wrong for other particular values; it may be longer than it needs to
> be; it's hard to update it to test other particular values or for
> other changes; it tends to be repetitive/redundant.
> 
> One case where it's harder to update: the info tests repeat the
> representation of bzr's default repository format many times, but what
> they actually want to assert in most cases is that the output contains
> an appropriate format name.
> 
> I'm suggesting that we don't always get this code right in either direction.
> 
>>> I was talking to jml the other day and he said that Launchpad have a
>>> cleaner separation of all of this setup into a TestFactory class
>> Cleaner or not, I find bzr's approach much easier to use.
> 
> .. so I'd like to work out what's wrong with what Launchpad does too,
> so as not to repeat it, if there's anything else you can think of.

Separating the test factory class from the test case means more typing
for two reasons:

1. The name of the factory object must be specified.

When writing tests for Launchpad, I usually subclass the
TestCaseWithFactory, which has a member named "factory".  So instead of
typing "self.makePerson", I must type "self.factory.makePerson".

2. Cleanup must be manually specified.

Many test fixtures need to be cleaned up-- they need locks released,
etc.  A method of TestCaseWithTransport can arrange for cleanup using
addCleanup, but a method of a test factory like Launchpad's cannot use
this facility.  This means that clients must manually specify cleanup.

We could alter the design so that the factory was initialized with a
cleanup facility.  We could alter the design so that the factory also
provided cleanup, and TestCase.tearDown would call it.  (Though I'd
worry about doing things out of order then.)

>>> (iirc) - if you want a Person to test, you would always get that from
>>> the factory rather than putting arbitrary example data into the class.
>> I don't understand.  There are plenty of examples where we create Person
>> objects with specific email addresses and names.  Look at this method
>> signature:
>>
>>    def makePersonNoCommit(
>>        self, email=None, name=None, password=None,
>>        email_address_status=None, hide_email_addresses=False,
>>        displayname=None, time_zone=None, latitude=None, longitude=None):
>>
>> These are all the things we've felt a need to specify in one test case
>> or another.
>>
>> It's true that we don't have to specify anything at all to makePerson.
>> If you were arguing that we should improve our existing helpers, I could
>> understand that, but I don't see how it relates to a TestFactory concept.
> 
> So, what is the different between our make* methods and a 'TestFactory
> concept'?  I'm not sure.  To me they're similar things; I used the
> name to suggest giving it more prominence or standardization.

That didn't seem to be your thrust at all.  I'm sorry if I
misunderstood, but I was reacting to the idea of adding a TestFactory
object, like Launchpad's, to bzr.  If that's not what you meant, then my
response must seem quite incoherent.

> Generally speaking we add make_* methods on various test classes,
> rather than centralizing them.  Many of them are things that could be
> more broadly reused.

This wasn't a problem I had noticed.  We have make_branch and
make_branch_and_tree, and I thought you were proposing making them
harder to use.  If instead, you're saying "make_branch is one honking
great idea, let's do more of them", then I'm 100% in favour.

>> I've never used BranchBuilder, for
>> example, and hardly ever used MemoryTree.
> 
> I think this is an interesting loose thread to pull upon - just
> exactly the kind of thing I was thinking about.  We have classes that
> are meant to make tests faster to run and easier to write, but they're
> not used consistently and you hardly use them at all.  So there's
> clearly something wrong here.  Possibly we should make those classes
> better, possibly we should give up on the whole idea.

In general, I don't see the advantages in speed being worth the cost in
simplicity.

MemoryTree is not close enough to a real working tree for me to be
comfortable with it.  I don't trust it to do what I want, and I don't
trust it to be a good enough emulation of a working tree.  I'd rather
use a PreviewTree, since I know and trust the TreeTransform interface.

BranchBuilder-- well, no compelling case has been given to me for using
it.  It wouldn't be appropriate for me to comment further until I
understand what it's meant for.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkqElmAACgkQ0F+nu1YWqI1pzACfW0iP8FhZYWPwm3AYwuw+Qs/t
/UIAnjqL9mSv3IrjK2/Ru7Lglj6EvwA+
=dvby
-----END PGP SIGNATURE-----