Uniter tests for update-status hook - BLOCKER

Mon Jul 20 04:42:37 UTC 2015

Hi folks,

Earlier today I was investigating this CRITICAL BLOCKER bug:
	https://bugs.launchpad.net/juju-core/+bug/1475724

At first I thought that bug was referring to a different one, which I
fixed by skipping a part of a test that was using chmod to make unit
enter an error state. I filed a bug for the fixing of the skip:
	https://bugs.launchpad.net/juju-core/+bug/1476060

However on deeper looking into the first bug, it seems that it is all
timing related.

It appears from the outside that the uniter calls the 'update-status'
hook periodically when it enters an idle state. It also appears to be
very timing related.  The tests are failing intermittently on ppc64el
and windows [1], almost certainly due to differences in timing on the
different platforms.

I propose that someone who understand the update-status work does some
serious rework to the uniter tests.  We should ideally be hiding the
update-status hook.

The failure from here:
	http://reports.vapour.ws/releases/2894/job/run-unit-tests-win2012-amd64/attempt/916

is this:

ctx.waitHooks     : []string{"install", "leader-elected",
"config-changed", "start", "update-status", "update-status",
"config-changed", "update-status"}

ctx.hooksCompleted: []string{"install", "leader-elected",
"config-changed", "start", "update-status", "config-changed",
"update-status"}

startupHooks(false) returns
  []string{"install", "leader-elected", "config-changed", "start",
"update-status"}
really it shouldn't care about update-status at all.

The test then expects the following two hook calls:
  waitHooks{"update-status", "config-changed"}

But as you can see above, there was not a second update-status before
the config-changed (but there is one after on both).

Really the test doesn't care one hoot about the update-status hook, all
it really cared about checking was the config-changed. [2]

We shouldn't be asserting things we don't care about.

Here are my thoughts as to what should be changed:

 - most tests should not care about the update-status hook, and it
shouldn't appear in any of the lists of hooks
 - the matchHooks method on the context should skip over the
update-status hook calls
 - explicit calls should be made if the user really cares about
update-status being at the end of the list, i.e. don't use matchHooks,
but add another method to look for the update-status

This should make the uniter tests not be so timing dependent.

Aside from all this work, it is becoming REALLY IMPORTANT that we stop
writing terrible, wasteful, full integration type tests when what we
really care about testing is some aspect of uniter internals. I know
that it is just simpler to copy what is there and make more, but it is
better to write smaller, targeted tests that just test what you are
wanting to assert.

Yes there should be integration level tests, but not all the unit level
tests should be tested at an integration level.

We need to not only stop writing more of these, but go back and fix the
ones that are there to be better.

Tim

[1] We should seriously start thinking how to gate landings on the unit
tests passing on amd64, ppc, and windwos.

[2] Most of the tests in uniter_test.go should be unit tests not
functional tests.