Wrote a small black-box test suite for bazaar

John Arbash Meinel john at arbash-meinel.com
Wed Aug 27 16:19:53 BST 2008





...

>  I don't follow what you mean with "bzr (say)". You seem to be suggesting
> that you can give precise performance figures for a test that runs an
> arbitrary program performing an arbitrary operation. Are these actual
> benchmark numbers of some sort? Can you provide more details if so?
> 
> Running  "while true ; do date ; done | uniq -c" in bash is a recognised way
> to test the performance of fork(), found by googling a bit. This produces
> around 600 forks per second on my (ancient Pentium 4) linux machine. If I
> print the date instead using "python -c 'import time; print time.asctime()'"
> I still get 50 forks a second. Either way, I don't come close to being as
> slow as 5 per second. With virtualisation I suppose it might be slower but 5
> a second seems extreme.

Try instead:

python -c 'from bzrlib import branch, repository; import time; print time.asctime()'

The overhead for spawning bzr is mostly "import bzrlib" time, because python
is quite slow at importing all of those modules and rebuilding their data
structures in memory for every new process.

"time bzr rocks" which is a mostly "no-op" command is about 100ms on this
machine. Which is about 10 per second.
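
If you want to see how much of that is plain interpreter startup versus the
bzrlib imports, something along the lines of the sketch below should show it.
(A rough illustration, not from the original mail; the avg_seconds helper is
made up here, and it assumes 'python' and bzrlib are available on the machine
being measured.)

import os
import subprocess
import time

DEVNULL = open(os.devnull, 'w')

def avg_seconds(argv, runs=20):
    # Average wall-clock time to spawn argv and wait for it to exit.
    start = time.time()
    for _ in range(runs):
        subprocess.call(argv, stdout=DEVNULL, stderr=DEVNULL)
    return (time.time() - start) / runs

# Bare interpreter startup vs. startup plus the bzrlib imports quoted above.
bare = avg_seconds(['python', '-c', 'pass'])
imports = avg_seconds(['python', '-c',
                       'from bzrlib import branch, repository'])
print('bare python startup: %.0f ms' % (bare * 1000))
print('plus bzrlib imports: %.0f ms' % (imports * 1000))
print('import overhead:     %.0f ms' % ((imports - bare) * 1000))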

...
> 
>  <pasted from original mail - did you not see this or not understand it?>
> 
> The main points:
> 1) You don't need to know the code to write tests (I've never looked at the
> code)

Well, you looked at least far enough to compare against the existing blackbox tests.

> 2) Tests don't depend on the structure of the code and hence don't need
> changing when the code is refactored.

With backwards compatibility, this is actually quite rare.

> 3) There are already quite a few blackbox tests that look like
> 
> def test_update_standalone_trivial(self):
>     self.make_branch_and_tree('.')
>     out, err = self.run_bzr('update')
>     self.assertEqual('Tree is up to date at revision 0.\n', err)
>     self.assertEqual('', out)
> 
> This is basically a way to write tests of that form without writing any
> code.
> 
> Regards,
> Geoff Bache
> 

You are welcome to do this, and if you find it useful and worthwhile, we'll
probably take a look at it. I would expect it to end up being quite slow in
the long run.

bzr selftest -s bzrlib.tests.blackbox --list

shows 844 blackbox tests. If each one of those has a single spawn, you are
adding at least 8s to the test suite in spawn overhead alone. But as you are
also planning on creating new branches and working trees, populating them,
committing the changes, etc., I wouldn't be surprised if there were 10 spawns
per test, or around 80s added. Right now, running all of the blackbox tests
takes about 80s, so your proposal is likely to at least double that time.
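
For reference, that back-of-envelope arithmetic works out roughly like the
sketch below, assuming the per-spawn overhead implied by the 8s / 844 figure
(around 10ms of raw spawn cost per process; a full "bzr rocks"-style spawn at
~100ms would make every number about ten times larger):

tests = 844         # blackbox tests listed by selftest
spawn_cost = 0.010  # assumed seconds of raw spawn overhead per process

print('1 spawn per test:   ~%.0f s' % (tests * 1 * spawn_cost))    # ~8s
print('10 spawns per test: ~%.0f s' % (tests * 10 * spawn_cost))   # ~84s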

Actually, if you really wanted to test the overhead of just running a real bzr
for the analysis step, you should be able to do something like:

TestCase.run_bzr = TestCase.run_bzr_subprocess

That won't make the test setup use a subprocess, but it will cause a full
spawn when running each command under test.
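
In context, that monkeypatch might look something like the sketch below
(assuming both methods live on bzrlib.tests.TestCase, as the line above
implies, and that you hook this in from a small driver script or plugin
before the suite is loaded):

from bzrlib import tests

# run_bzr() normally dispatches the command in-process, while
# run_bzr_subprocess() spawns a real bzr.  Aliasing one to the other means
# the test fixtures are still built in-process, but each command under test
# pays the full spawn cost.
tests.TestCase.run_bzr = tests.TestCase.run_bzr_subprocess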

In the end, we started off with a test suite that spawned bzr for everything
(init/add/commit/etc.), and getting rid of that saved a *lot* of time. It also
simplifies the tests, so that each can focus on one aspect rather than mixing
lots of combined effects, which is a good sign for unit tests.

I think it comes down to the fact that spawning "bzr" is not a simple fork,
but starting up the python VM and importing all of the python libraries that
we use. It is an unfortunate truth that:

  'time bzr rocks'

is vastly slower than

  'time date'

John
=:->


