Release testing and the relationship between 'bzr selftest' and plugins

Thu Mar 15 17:13:20 UTC 2012

On 15/03/12 16:11, Vincent Ladeuil wrote:
>>>>>> Jelmer Vernooij <jelmer at samba.org> writes:
>
>
>     >  * plugins can have dependencies - would we start shipping the svn, apr
>     > and mercurial sourcecode with bzr?
>     >  * some plugins have a different landing mechanism than bzr.dev;
>     > requiring review, for example
>
> I'd say soft dependencies in bzr-core and build dependencies for
> packages, or should that be recommendations instead?
We have to bundle the build dependencies, otherwise the plugins can't be
used and thus bundling them becomes pointless (since e.g. PQM will skip
all the tests).

If we require e.g. PQM to have the dependencies pre-installed then we
end up with another problem, which is ensuring that the right build
dependencies are installed.

>     >> 2 - push plugin authors to create series targeted at bzr releases: avoid
>     >> many maintenance issues :) This will also help installer
>     >> builders/packagers.
>     > For most plugins, this doesn't scale with the number of release series
>     > and the size of the plugins. It isn't worth the effort to maintain
>     > separate release series if it's trivial to be compatible with more
>     > versions of bzr.
>
> Balance to be found again, some plugins may just want to tag specific
> revisions for a given series if they don't evolve a lot between series.
>
>     > For plugins that are tightly coupled with particular bzr versions,
>     > like the foreign branch plugins, this is an option. But it still
>     > wouldn't have prevented the problems we had with the 2.5
>     > installers. Changes between beta 4 and beta 5 broke the foreign
>     > branch plugins, and the installers shipped with an outdated
>     > version of those plugins (from the correct release series).
>
> Sure, but at least the packagers can subscribe to the tip of a given
> branch and be done.
In practice, neither of these seems to happen, though.

>
>
>     > We don't have to bundle the plugins to make sure that various
>     > bits of our infrastructure run selftest with the plugins. Neither
>     > does bundling the plugins guarantee that developers won't start
>     > disabling some plugins that slow down their test runs.
>
> Hence we need a CI system but as mentioned, a CI system has high
> requirements: failing tests should be dealt with asap before the S/N
> ratio drops.
+1
>
>     >> <snip/>
>     >> 
>     >> > Once we fix the previous issue, I'm sure more developers will
>     >> > start running more of the tests. Perhaps it would also be possible
>     >> > to have a babune slave run the tests for all plugin trunks against
>     >> > bzr.dev?
>     >> 
>     >> It's been on babune's TODO list for quite a long time but doesn't make
>     >> sense until we get back to a point where all core tests are passing.
>     >> 
>     >> That's another vicious circle: a CI system is valuable only when 100% of
>     >> the tests are passing. As soon as you start having even a single
>     >> spurious failure, the S/N ratio goes down and there is no point adding
>     >> more tests (or rather expect much value out of the CI system, adding
>     >> tests in itself can't be bad, can it ? ;).
>     >> 
>     >> One way to mitigate that would be to define and maintain different test
>     >> suites that we can mix and match differently to suit our needs:
>     >> 
>     >> - a critical one for pqm, no exception accepted,
>     >> 
>     >> - a less critical one for babune: excluding known spurious failures to
>     >> at least get to a point where babune can be relied upon
>
>     > Can't we perhaps just be more pro-active about spurious failures?
>
> As in tackling https://bugs.launchpad.net/bzr/+bugs?field.tag=babune and
> https://bugs.launchpad.net/bzr/+bugs?field.tag=selftest you mean ?
Hah, thanks! Didn't realize you had filed those. We should also address
the tests by fixing them or disabling them (and opening bugs).
>     > I think we should either fix or disable tests (and file bugs) with
>     > spurious failures rather than keeping them enabled and stumbling
>     > over them constantly.
>
>     > Tests that flap aren't useful for either PQM or CI, I don't think we
>     > should treat them differently.
>
> Right, we've had enough of them to decorate them, maybe? I did exclude
> tests on babune at one point, but this is not a good solution as I then
> forgot about them, so we need some in-core tracking to get better
> visibility.
>
> Probably something along the lines of retrying once and warning if it
> fails twice, but without letting selftest itself fail, and emitting a
> final summary mentioning the number of such spurious failures.
Doing this just for tests that are known to be spurious you mean, or for
all tests ? I wouldn't be in favor of the latter.
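
To make the retry-once idea concrete, here is a minimal sketch restricted to
tests explicitly marked as known-spurious. It uses plain Python unittest
rather than bzrlib's own test infrastructure; the names known_spurious,
SPURIOUS_FAILURES, spurious_summary and DemoTests are all hypothetical and
not existing bzr APIs. Note that setUp/tearDown are not re-run between the
two attempts, which a real implementation would probably want.

```python
import functools
import io
import unittest

# Hypothetical in-process record of marked tests that failed twice in a row.
SPURIOUS_FAILURES = []


def known_spurious(test_method):
    """Retry a known-flaky test once; record a second failure instead of raising."""
    @functools.wraps(test_method)
    def wrapper(self):
        for attempt in (1, 2):
            try:
                return test_method(self)
            except unittest.SkipTest:
                raise  # skips are not failures, pass them through
            except Exception as exc:
                if attempt == 2:
                    # Warn-only: remember it for the summary, keep the run green.
                    SPURIOUS_FAILURES.append((self.id(), repr(exc)))
        return None
    return wrapper


def spurious_summary():
    """Final summary line mentioning the number of spurious failures."""
    return "%d known-spurious test(s) failed twice" % len(SPURIOUS_FAILURES)


class DemoTests(unittest.TestCase):
    calls = 0

    @known_spurious
    def test_flaps_once(self):
        DemoTests.calls += 1
        self.assertGreater(DemoTests.calls, 1)  # fails on the first attempt only

    @known_spurious
    def test_always_fails(self):
        self.fail("still broken")


def run_demo():
    """Run the demo suite quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DemoTests)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

Undecorated tests keep failing normally, so only tests someone has
deliberately tagged get the lenient treatment, and the summary keeps them
visible instead of letting them be forgotten.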
>     >> > And we're less likely to find issues at install time if the full
>     >> > testsuite is already being run regularly.  Of course, it will slow
>     >> > down the release process somewhat, having to wait for the full
>     >> > testsuite for bzr core and all plugins to pass and all.
>     >> 
>     >> Release time is not the right time to run heavy testing, this is
>     >> precisely what CI and time-based releases are targeting: cutting a
>     >> release should be just:
>     >> 
>     >> - check that tests have been passing lately,
>     >> - check that no critical issues are pending,
>     >> - tidy up the news,
>     >> - cut the tarball.
>     >> 
>     >> I.e. only administrative stuff, no last-minute rush for landing, no bug
>     >> fixes, no source changes :) The rationale is that any change requires
>     >> testing (which takes time) *and* can fail which delays the release. This
>     >> goes against time-based releases and as such should be avoided as much
>     >> as possible (common sense should be applied for exceptions as usual).
>     >> 
>     >> I'd go as far as saying that if we need to change the release process it
>     >> should be by *removing* tasks, never adding new ones.
>     > I'm only saying there should be a final "bzr selftest" run to verify
>     > everything is ok, not that this is a point to find and fix all
>     > compatibility issues. If we have proper CI and run "bzr selftest" with
>     > plugins regularly, then this will almost certainly pass. But a last
>     > check like this will prevent brown paper bag releases of the installers,
>     > as we had for 2.5.0. And that costs even more RM time.
>
> Indeed.
>
> So, if that wasn't clear, let me reiterate: I'm in full agreement
> about:
>
> - spending more time on ensuring that the full test suite is always
>   passing,
>
> - tweaking the 'full test suite' definition so it matches what we really
>   care about (this means tagging spurious failures in a way that ensures
>   that they are addressed, adding whatever plugins we think are worth
>   the maintenance effort and <other ideas>)
>
> I think we agree far more than we disagree on most of the topics so
> let's address the ones we agree on ;)
Works for me ! :-) So, let's:

 * Run "bzr selftest" and file bugs for issues we encounter
 * Fix said bugs
 * Run "bzr selftest" while we sleep
 * Run "bzr selftest" during lunch break
 * Run "bzr selftest" in the shower
 * ...
 * Can we run "bzr selftest" with a set of passing plugins installed on
babune? We can start with just one and add more as we verify they pass
the testsuite
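
A minimal sketch of the "start with one plugin and add more" approach from
the last item. The function grow_passing_set and the run_selftest callable
are hypothetical; how run_selftest would actually invoke "bzr selftest" with
only the given plugins enabled on babune is deliberately left open here.

```python
def grow_passing_set(candidates, run_selftest):
    """Add plugins one at a time, keeping only those that leave selftest green.

    candidates:   plugin names, in the order we want to try them.
    run_selftest: hypothetical callable taking a list of plugin names and
                  returning True if the full test suite passes with them.
    """
    passing, rejected = [], []
    for plugin in candidates:
        trial = passing + [plugin]
        if run_selftest(trial):
            passing = trial          # keep the plugin in the CI set
        else:
            rejected.append(plugin)  # file a bug, retry once it is fixed
    return passing, rejected
```

Each rejected plugin is tested against the set that is already known to
pass, so a failure can be attributed to the newly added plugin rather than
to the core test suite.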

Cheers,

Jelmer
