[rfc] list tests known to fail on a platform
Mark Hammond
mhammond at skippinet.com.au
Fri Sep 5 03:14:48 BST 2008
Hi Martin,
> There is a bit of a chicken and egg problem that it's hard to enforce
> the test suite passing if it doesn't currently pass, but it's hard to
> get everything passing if there is no automatic protection against new
> things being broken. (Either because of actual portability bugs in
> Bazaar, or because of tests being added or changed to have platform
> dependencies.)
For my interest, how would the PQM enforce that the test suite passes
across platforms?
> One way out of this would be to say that some set of tests are known
> to fail on particular platforms. Much like KnownFailure at the
> moment, if they started unexpectedly passing we'd trap this and it'd
> be a clue to either investigate and perhaps mark them as expected to
> pass in future. If we had that up to date at least it would give us a
> reference point to work from.
In general I think that is fairly reasonable: it's unreasonable to assume
that all supported platforms will always pass the entire test suite, so
some way of tracking further regressions on those platforms is valuable.
Regarding Windows, there are probably only a few core issues causing
problems. I've got a patch in the queue for one of them (files still
being in use when we try to remove them), and another cause of many
failures is LockContention errors. This is apparently a well-known
difference between Windows and Linux with respect to locks, so it makes
sense to silence these one way or another.
Once we cut these out, most of the other test failures should be fairly
shallow, and most can be addressed on a case-by-case basis. For example,
some tests use a colon in file names, which will never work on Windows:
either the colon needs to be removed, or the test needs to be
skipped/xfailed on Windows, as sketched below. Really, though, there
aren't that many of them. Due to the nature of the bzr test suite, a
single "problem" can result in a large number of test failures, so the
number of errors isn't really a good representation of the number of
underlying problems.
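As a concrete example of the case-by-case fix, the colon tests can
simply be skipped on Windows. A minimal sketch, assuming bzrlib's
TestCaseInTempDir, build_tree() and TestSkipped (names from memory,
so treat the details as approximate):

    import os
    import sys

    from bzrlib import tests

    class TestColonFilenames(tests.TestCaseInTempDir):

        def test_filename_with_colon(self):
            # Colons are reserved characters in Windows file names,
            # so this scenario can never work there.
            if sys.platform == 'win32':
                raise tests.TestSkipped(
                    'colons are not legal in Windows file names')
            self.build_tree(['a:b'])
            self.assertTrue(os.path.exists('a:b'))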
In other words, with a little bit of a push I think we *can* get the
Windows test suite very close to passing (the best I've seen is about 50
errors and 50 failures, but most of those were LockContention problems).
> Of course there is some risk that we'd just be sweeping the parrot
> under the carpet and never come back to fix them. There are at least
> some xfail tests at present that have sat for a long time, and some
> that are kind of miscategorized. But at least it would let us get a
> red/green indication of whether the number is getting worse or better.
If a bug number was associated with, and printed for, each of the
suppressed tests, it might help keep things on the radar. One or two
tests already do that, e.g.:
    blackbox.test_info.TestInfo.test_info_locking        XFAIL   553ms
        OS locks are exclusive for different processes (Bug #174055)
Interestingly enough, that looks like the exact same issue that is
causing the LockContention failures.
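The mechanics are simple enough to copy wherever a test is being
suppressed: include the bug number in the message the suppression
prints. A rough sketch of the shape, assuming bzrlib's KnownFailure
exception; the test body here is invented for illustration and is not
the real test_info_locking:

    from bzrlib import tests

    class TestInfoLocking(tests.TestCaseWithTransport):

        def test_info_locking(self):
            try:
                self._info_while_locked()
            except AssertionError:
                # Citing the bug in the KnownFailure message means
                # every run prints it, keeping it on the radar.
                raise tests.KnownFailure(
                    'OS locks are exclusive for different processes '
                    '(Bug #174055)')

        def _info_while_locked(self):
            # Hypothetical stand-in for the real body, which checks
            # 'bzr info' output while the branch is locked.
            raise AssertionError('second lock not available')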
> particular platform. If there are many (on Windows maybe?) this might
> be more manageable.
I think we can get the number of failures small enough on Windows to manage
individually...
Thanks!
Mark