Reducing Regression Test Suite: LTP

Po-Hsu Lin po-hsu.lin at canonical.com
Thu Oct 14 17:10:37 UTC 2021


On Wed, Jul 14, 2021 at 11:26 PM Krzysztof Kozlowski
<krzysztof.kozlowski at canonical.com> wrote:
>
> On 13/07/2021 13:53, Po-Hsu Lin wrote:
> > On Tue, Jul 13, 2021 at 7:37 PM Thadeu Lima de Souza Cascardo
> > <cascardo at canonical.com> wrote:
> >>
> >> On Tue, Jul 13, 2021 at 01:03:42PM +0200, Krzysztof Kozlowski wrote:
> >>> On 13/07/2021 12:50, Thadeu Lima de Souza Cascardo wrote:
> >>>> On Tue, Jul 13, 2021 at 12:18:09PM +0200, Krzysztof Kozlowski wrote:
> >>>>> Hi all,
> >>>>>
> >>>>> We talked about possibility of reducing our regression test suite. I
> >>>>> have a candidate for this - LTP (Linux Test Project).
> >>>>>
> >>>>> Each run of full LTP takes around 4 hours (2 - 2.5h for ubuntu_ltp, 40
> >>>>> minutes for ubuntu_ltp_stable and ~1h for ubuntu_ltp_syscalls). I looked
> >>>>> at cloud instances (4 and 48 cores).
> >>>>>
> >>>>> LTP tests everything: known kernel bugs and CVEs, kernel syscalls and
> >>>>> user-space interfaces, network and probably more. It is a huge test suite.
> >>>>>
> >>>>
> >>>> Which is why it is so valuable. It not only tests that the kernel's
> >>>> interfaces are behaving as expected, but also exercises them, preventing
> >>>> us from missing an important regression.
> >>>
> >>> Thanks for the comments. All of the regression tests are valuable, not
> >>> only LTP. However, with this approach we might never reduce them...
> >>>
> >>
> >> I think that reasoning might work for a test like xfs_tests, for example,
> >> where we should not be carrying filesystem changes in our derivatives,
> >> save for CIFS.
> >>
> >> Still, we have observed some cloud tests point out odd behavior on btrfs
> >> due to a different number of CPUs.
> >>
> >> But if we keep reducing our tests on the derivatives under that argument,
> >> we might end up not testing much more than boot. And amongst the tests we
> >> run, I find ubuntu_ltp_syscalls, at least, to be one that we should be
> >> running on all our kernels.
> >>
> >> Perhaps I am too biased towards ubuntu_ltp_syscalls, and have not looked
> >> much at the tests that we run under ubuntu_ltp_stable. Still, I am pretty
> >> sure I could make an excuse for each of the flaky tests there, even though
> >> for some of them the excuse would be: the test is broken and should be
> >> fixed. ENOTENOUGHTIME.
> >>
> > If the concern here is that the ubuntu_ltp / ubuntu_ltp_stable tests are
> > taking too long to run on one instance, another solution is to break them
> > down like what we did for syscalls. We can move tests like controllers and
> > dio, which take up to 1 hour to run, into a new test suite such as
> > "ubuntu_ltp_controllers" in ACT.
>
> This could help because it would allow re-running a smaller subset of tests
> on some failures and seeing the results faster. But test duration is not
> the only problem.
>
> For example, several LTP controller tests are:
> 1. Outdated, because they were tuned for older kernels where cgroups were
> different. Only minor updates (fixups) have happened to these tests
> recently - to make them work on newer kernels - but no one, I think, has
> redone them for new kernels. And they should be redone, because kernel
> internals have changed a lot since then.
> For example, in several places the memcg tests assume hierarchical groups
> can be turned on/off. Since kernel v5.11 you cannot disable hierarchical
> mode - it is always on. The tests were somewhat tweaked to handle this,
> but half of them are now confusing or bail out early.
>
> 2. So specific or tight that they fail under any conditions other than
> the ones the author intended. I spent a few days here fixing up the
> memcg tests because they assumed that kernel memory is not accounted
> per group (not true since v5.9) and that process memory allocation or
> subgroup management does not create side effects (e.g.
> memcg_use_hierarchy_test.sh sets a limit of 1 page, but on a two-node
> machine "mkdir subgroup" causes an allocation of 100 pages of kmem!).
> See the bottom of:
> https://lists.linux.it/pipermail/ltp/2021-July/023803.html
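
For illustration, both failure modes above can be reproduced by hand
with a cgroup v1 memory controller mounted at /sys/fs/cgroup/memory.
This is a rough sketch; the group names and numbers are made up, not
taken from the LTP tests:

  # 1) Hierarchical mode can no longer be disabled: on kernels >= v5.11
  #    the legacy knob always reads 1 and clearing it is expected to fail.
  cd /sys/fs/cgroup/memory
  mkdir demo
  cat demo/memory.use_hierarchy          # prints 1
  echo 0 > demo/memory.use_hierarchy     # write error on recent kernels

  # 2) Kernel memory is accounted per group since v5.9, so even creating
  #    a child group charges kmem and can blow a one-page limit.
  mkdir parent
  echo 4096 > parent/memory.limit_in_bytes
  mkdir parent/subgroup                  # may fail or push usage over the limit
  cat parent/memory.kmem.usage_in_bytes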
>

Hello,
I am resurrecting this thread to gather some input on our recent LTP
test changes:

1. We're now using our own fork: https://kernel.ubuntu.com/git/ubuntu/ltp.git
The idea is to carry pending-review patches as local SAUCE patches; we
don't have any at the moment (except my update notes), as all of them
have been accepted upstream \o/
For future updates, I am not sure whether we should follow their release
policy - it appears to be one release every 4 months - or whether we
could do the update every 2 or 3 SRU cycles instead?

2. We have the controllers subset separated from ubuntu_ltp as
ubuntu_ltp_controllers.
This makes it easier to hint, and a bit more "browser friendly", as the
report is not *that* lengthy now that we are no longer running almost
everything together. And with Krzysztof's hard work, this test does not
stink as much as it used to.
If there are no objections, my plan is to split more failing subsets out
of ubuntu_ltp (e.g. kernel_misc, fs) to improve our hinting quality. With
more LTP sub-tests broken down into smaller pieces, we can rethink
whether we want to adjust our test plan for derivative kernels.
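
For reference, those subsets map onto LTP's own runtest files, so a
split suite would essentially scope runltp to one file per suite. A
sketch only - the ACT plumbing is omitted and the suite names in the
comments are hypothetical:

  cd /opt/ltp
  ./runltp -f kernel_misc    # would back a hypothetical ubuntu_ltp_kernel_misc
  ./runltp -f fs             # likewise for a filesystem-only suite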

I will be adjusting the tests in ubuntu_ltp_stable too, moving the
failing pty (lp:1922819) and sched (lp:1931325) tests out to make sure
it stays green.
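
In LTP terms, one hedged way to express that is a skip file passed to
runltp via -S; the test tags below are placeholders, the real failing
cases are the ones tracked in the bugs above:

  cat > /tmp/ltp-stable.skip <<'EOF'
  pty04
  sched_stress
  EOF
  /opt/ltp/runltp -f pty,sched -S /tmp/ltp-stable.skip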

Also, we're not running the ubuntu_ltp_syscalls test on generic kernels,
either on bare metal or on cloud instances. Based on the discussion
above, I think this should be added.

Please feel free to comment.
Thanks
Sam



> Cgroups were a terribly unstable interface, so maybe that is one of
> the issues. But in any case, LTP expects that kernel memory
> accounting/charging will follow some imaginary rules, and this is
> simply wrong. How the kernel accounts memory per group is not part of
> the API or ABI. These are internals which can change from release to
> release. I fixed up the memcg tests now, but they will keep failing
> every X kernel releases.
>
> At the same time, most of the controllers' interface and behavior is, I
> think, covered by the kernel selftests, so duplicating that with the
> poorly designed LTP controllers tests is okay if we have spare time.
> But we are closer to ENOTENOUGHTIME.
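
For completeness, the selftest coverage referred to above can be run
straight from a kernel source tree with the standard kselftest
invocation (assuming a configured tree and toolchain), so dropping the
LTP controllers run would not leave that area entirely untested:

  make -C tools/testing/selftests TARGETS=cgroup run_tests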
>
> Best regards,
> Krzysztof


