Cmnts: [ACT][PATCH] UBUNTU: SAUCE: ubuntu_boot: use dmesg collected by autotest

Wed Aug 4 16:38:51 UTC 2021

Hey Sam,

I have some thoughts on this patch, and figured I would outline them 
here.  You may agree with some of these points and not with others; 
regardless, I think it's best to bring them up.

I'm usually a pushover with these sorts of changes. However, I don't 
like complex rules in tests that gate the progression of a kernel, and 
that is the case with changes to the ubuntu_boot test.

Any patch/update to this test now affects the progression of ALL 
kernels on 5 arch types. It may not seem like a big deal, but it is.

1.)  This patch proposes to change the log scanning of the ubuntu_boot 
test from /var/log/syslog to dmesg.  I don't see a problem with that, 
unless others with more kernel experience see one; please comment here.

  1a.) Will the BUG:/Oops:/kernel:/WARNING: messages all appear in 
dmesg in addition to syslog?

  1b.) Will this work on all archs?

I need to know that these messages will still be caught, on all the 
arch types, because we do not want them to slip through testing.

  1c.) My concern is that most of CKCT uses syslog, so should we not 
mirror the change there?

  1d.) Should we be scanning both syslog and dmesg, in case messages 
appear in one log but not the other?
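
To make the "scan both" question concrete, here is a rough sketch and not 
a proposal for the actual test.  It assumes syslog lines carry a 
"kernel: " prefix while raw dmesg lines start directly at the timestamp, 
so the prefix is made optional and hits are de-duplicated on the message 
itself.  PATTERNS, scan_lines and scan_sources are made-up names for 
illustration only.

```python
import re

# Hypothetical pattern list. The "kernel: " prefix is optional so the same
# expression matches a syslog line and a raw dmesg line; group 1 is the
# "[timestamp] message" part shared by both formats.
PATTERNS = [
    re.compile(r'(?:kernel: )?(\[ *\d+\.\d+\] (?:BUG|Oops|WARNING):.*)'),
]


def scan_lines(lines):
    """Return the kernel message part of every line that matches a pattern."""
    hits = []
    for line in lines:
        for pattern in PATTERNS:
            match = pattern.search(line)
            if match:
                hits.append(match.group(1))
                break
    return hits


def scan_sources(sources):
    """Scan several logs and report each distinct message once.

    sources maps a label (e.g. 'syslog') to an iterable of log lines.
    """
    seen = []
    for lines in sources.values():
        for hit in scan_lines(lines):
            if hit not in seen:
                seen.append(hit)
    return seen
```

With this shape, a message that shows up in only one of the two logs is 
still reported, and one that shows up in both is reported once.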

2.) I personally don't want to add customized rules in ubuntu_boot, for 
the sake of handling a corner case.  What is the corner case?   A 
manually provisioned system.

Any special handling should be done within CKCT. The ubuntu_boot test 
should be as simple as possible, clear and concise. It should not need 
to know where it's being installed, or whether the installed system has 
been running for 50 days or 5 hours.  Its purpose is to scan the logs 
and catch any of the messages mentioned in point #1a.
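
As a sketch of what I mean by "simple" (an illustration under my own 
assumptions, not the current ubuntu_boot code): the scanner takes a log 
path from whoever calls it and knows nothing about how the system was 
provisioned.  The function names and the single pattern below are 
hypothetical, and this is written against Python 3's gzip and re modules.

```python
import gzip
import re

# Assumed pattern: matches the "[timestamp] BUG:/Oops:/WARNING:" part, which
# is present in both syslog lines and raw dmesg lines.
KERNEL_ERROR = re.compile(r'\[ *\d+\.\d+\] (?:BUG|Oops|WARNING):')


def open_log(path):
    """Open a plain or gzip-compressed log file as text."""
    if path.endswith('.gz'):
        return gzip.open(path, 'rt', errors='replace')
    return open(path, 'r', errors='replace')


def find_kernel_errors(path):
    """Return every line in the log at `path` that looks like a kernel error.

    The caller decides which log to pass in; the scan itself has no
    provisioning-specific branches.
    """
    with open_log(path) as log:
        return [line.rstrip('\n') for line in log if KERNEL_ERROR.search(line)]
```

Which file gets passed in (syslog, or a dmesg.gz saved by the framework) 
would then be decided outside the test.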

Everyone should keep in mind that all of the tests are intended to 
run on a fresh, clean system. In my opinion, the tests themselves should 
be completely agnostic to that fact.

The rest is pretty much a rant and not pertinent to the patch itself, 
but since we're on this topic I thought it best to list these points here.

3.) I personally don't think we should add customized scripts, such 
as the kernel_taint script from checkbox, to the ubuntu_boot test unless 
the test has been thoroughly tested on all -main kernels of all 
archs/series. That is not too much to ask; the assumption is that the 
derivatives will also work properly if the -main kernels pass. 
Not to mention it's just another thing to add onto the endless pile of 
work: monitoring if/when the kernel_taint script gets updated.  We have 
already seen cases where the kernel_taint script fails on s390x, which 
means it should be fixed, NOT HINTED. (That is my opinion only)  =)

4.) ubuntu_boot should not be hinted at all. You're gating the kernels 
on this. If there is a problem with ubuntu_boot, it *should* be fixed.

5.) We can probably get rid of anything related to CentOS. The kernel 
team's branch of Autotest and the kernel team's branch of 
Autotest-Client-Test are both customized for Debian. Anything related to 
CentOS does not work. This is the reality of it. And that's really bad.

-Sean

On 8/3/21 7:30 AM, Po-Hsu Lin wrote:
> BugLink: https://bugs.launchpad.net/bugs/1937276
>
> Checking error from syslog works for freshly provisioned systems, but
> with the manually provisioned systems since the log is not guaranteed
> to be the boot log for the current session, it can be contaminated by
> other tests and trigger false-positives.
>
> Use dmesg collected by the autotest framework for this instead.
>
> Signed-off-by: Po-Hsu Lin <po-hsu.lin at canonical.com>
> ---
>   ubuntu_boot/ubuntu_boot.py | 21 +++++++++++++++------
>   1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/ubuntu_boot/ubuntu_boot.py b/ubuntu_boot/ubuntu_boot.py
> index 8782818f..7d7799b2 100644
> --- a/ubuntu_boot/ubuntu_boot.py
> +++ b/ubuntu_boot/ubuntu_boot.py
> @@ -14,15 +14,24 @@ class ubuntu_boot(test.test):
>   
>       def log_check(self):
>           '''Test for checking error patterns in log files'''
> -        '''Centos Specific Boot Test Checks'''
> -        os_dist = platform.linux_distribution()[0].split(' ')[0]
> +        '''Please run this on a freshly rebooted / provisioned system'''
>   
> -        # dmesg will be cleared out in autotest with dmesg -c before the test starts
> -        # Let's check for /var/log/syslog instead
> -        if os_dist == 'CentOS':
> -            logfile = '/var/log/messages'
> +        # dmesg before the test will be compressed and cleared with dmesg -c
> +        # the log will be stored in autotest/client/results/default/sysinfo/dmesg.gz
> +        dmesg_gz = os.path.join(os.environ['AUTODIR'], 'results/default/sysinfo/dmesg.gz')
> +        if os.path.exists(dmesg_gz):
> +            logfile = '/tmp/dmesg-ubuntu-boot'
> +            cmd = 'gzip -dk {} -c > {}'.format(dmesg_gz, logfile)
> +            utils.system(cmd, ignore_status=True)
>           else:
> +            # Fallback to syslog, which works for newly deployed node but not ideal for
> +            # manually provisioned SUTs as the content is not just for the current session
>               logfile = '/var/log/syslog'
> +            # Centos Specific Boot Test Checks
> +            os_dist = platform.linux_distribution()[0].split(' ')[0]
> +            if os_dist == 'CentOS':
> +                logfile = '/var/log/messages'
> +
>           patterns = [
>               'kernel: \[ *\d+\.\d+\] BUG:.*',
>               'kernel: \[ *\d+\.\d+\] Oops:.*',