Daily builds are now back to green, need better test output

Wed Nov 6 11:22:49 UTC 2013

On 6 November 2013 10:01, James Hunt <james.hunt at canonical.com> wrote:
> Hi Dmitrijs,
>
> On 06/11/13 04:30, Dmitrijs Ledkovs wrote:
>> I've committed a few fixes to fix
>> test-suite errors on PPAs:
>>
>> * set & use XDG_RUNTIME_DIR, if not set as otherwise spawning session
>> init was failing
>>
>> * which uncovered that test_umask failure, too restrictive umask under
>> test (remember session-init needs to create subdirectories)
>>
>> * and disabled timing tests, since those don't work on virtualised builders
>>
>> So at the moment this brings all daily builds back to green. Let's try
>> to keep those up =)
> Thanks! Agreed, but let's not forget that *all* the tests are "green" prior to
> any push to lp:upstart (having passed locally). But as we know, new tests
> sometimes fail in the slightly odd lp build environment. Maybe using
> sbuild-launchpad-chroot would help a little (although that still won't give an
> identical environment AIUI).
>

Yeah, true. Here, for example, the failures mostly came from sbuild not
setting "user centric post-Precise variables" such as XDG_RUNTIME_DIR,
and from the lack of any recent kernel on virtualised PPAs (a
long-standing known problem, to be resolved).
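
To illustrate the XDG_RUNTIME_DIR fallback, here is a minimal C sketch.
It is not the code that was committed: the helper name
ensure_xdg_runtime_dir and the choice of a mkdtemp() directory are
assumptions made for the example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for mkdtemp(), setenv(), unsetenv() */
#endif

#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (name and approach are assumptions):
 * make sure XDG_RUNTIME_DIR is set before a test spawns a session init.
 *
 * template_buf must be a writable string ending in "XXXXXX"; it is only
 * used when the variable is unset or empty.
 *
 * Returns 0 if the variable was already set or has been set, -1 on error.
 */
static int
ensure_xdg_runtime_dir (char *template_buf)
{
    const char *current = getenv ("XDG_RUNTIME_DIR");

    /* Respect a value provided by the environment. */
    if (current && *current)
        return 0;

    assert (template_buf != NULL);

    /* mkdtemp() creates the directory with mode 0700 and rewrites
     * template_buf in place with the generated name. */
    if (! mkdtemp (template_buf))
        return -1;

    return setenv ("XDG_RUNTIME_DIR", template_buf, 1);
}
```

A test would call this once during setup, e.g. with a buffer such as
"/tmp/upstart-test-XXXXXX", and remove the directory again on teardown.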

>>
>> WRT declaring failing tests, skipped, expected fail, and bailing out.
>>
>> At the moment, we mostly abort and/or skip executing tests
>> with/without telling the user about it. I think it would be best to
>> implement TAP[1] output for our tests, such that we can properly skip
>> tests, or mark them as expecting to fail.
>>
>> Announcing that "some tests may or may not fail, because one has
>> overlayfs mountpoint" is not helpful. Saying that "test_conf test
>> #3-7" are expected to fail with a reason due to overlayfs mountpoint,
>> is much better.
>>
>> In addition it should enable us to write tests, which are known to
>> fail at the moment and mark them as TODO for future.
>>
>> Similarly this will improve ability to report and compare test-suite
>> results on different platforms (e.g. how many tests are not run in
>> virtualised PPAs vs nonvirt vs overlayfs).
>>
>>
>> [1] http://en.wikipedia.org/wiki/Test_Anything_Protocol
>>
> Take a close look at https://bugs.launchpad.net/libnih/+bug/933717. I think we
> should resurrect this and see if Scott would be happy to merge it. I did discuss
> it a long time ago with Scott and I think his concern was retaining the existing
> API. If you look at the attachment, you'll see I've proposed defining
> NIH_TEST_VERSION_2 to enable the new functionality, but there are other options
> we could explore.
>

Ah, interesting. I'll have a look. I've started by writing an
alternative nih/test_output_tap.h, which is selected for inclusion based
on a define.

I wanted to keep both output formats, and keep the new one backwards compatible.
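
As a rough idea of what TAP lines look like, here is a minimal C sketch
of formatting a plan line and a result line with an optional SKIP or
TODO directive. The function names tap_format_plan and
tap_format_result are hypothetical and are not taken from libnih or
from the header mentioned above; the compile-time selection between
output formats is not shown.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format the TAP plan line, e.g. "1..7". */
static int
tap_format_plan (char *buf, size_t len, int count)
{
    assert (buf != NULL && count >= 0);

    return snprintf (buf, len, "1..%d", count);
}

/* Hypothetical helper: format one TAP result line.
 *
 * directive is NULL, "SKIP" or "TODO"; reason may be NULL.
 * Returns the number of characters the full line needs, as snprintf does.
 */
static int
tap_format_result (char *buf, size_t len, int number, int passed,
                   const char *description,
                   const char *directive, const char *reason)
{
    int used;

    assert (buf != NULL && description != NULL);

    used = snprintf (buf, len, "%s %d - %s",
                     passed ? "ok" : "not ok", number, description);

    /* Append " # DIRECTIVE reason" only if the first part fitted. */
    if (directive && used >= 0 && (size_t) used < len)
        used += snprintf (buf + used, len - (size_t) used, " # %s%s%s",
                          directive,
                          reason ? " " : "",
                          reason ? reason : "");

    return used;
}
```

With this, a test known to fail on an overlayfs mountpoint would be
reported as "not ok 3 - test_conf # TODO overlayfs mountpoint" rather
than silently skipped or aborting the run.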

> With minimal changes to the existing Upstart tests (basically we just need to
> run sed over the code) we can do everything you suggest and a whole lot more.

Will take a look at that one.

> That, plus fixing automake to actually show verbose logs by default
> ('serial-tests' or add some logic to display the verbose log at completion time).
>

Well, I've cheated:
In debian/rules I now catch a failure from $ make check and
cat "test-suite.log" when that happens.
See for example:
https://launchpadlibrarian.net/155928411/buildlog_ubuntu-saucy-amd64.upstart_1.10%2Bdaily%2B1555%2B1520%2B201311060249~ubuntu13.10.1_FAILEDTOBUILD.txt.gz

Not ideal, but at least something. And with TAP we will gain a _better_
test-suite summary.

Regards,

Dmitrijs.