Reducing Regression Test Suite: LTP

Krzysztof Kozlowski krzysztof.kozlowski at canonical.com
Tue Jul 13 11:03:42 UTC 2021


On 13/07/2021 12:50, Thadeu Lima de Souza Cascardo wrote:
> On Tue, Jul 13, 2021 at 12:18:09PM +0200, Krzysztof Kozlowski wrote:
>> Hi all,
>>
>> We talked about the possibility of reducing our regression test suite.
>> I have a candidate for this - LTP (Linux Test Project).
>>
>> Each run of the full LTP takes around 4 hours (2-2.5h for ubuntu_ltp, 40
>> minutes for ubuntu_ltp_stable and ~1h for ubuntu_ltp_syscalls). I looked
>> at cloud instances (4 and 48 cores).
>>
>> LTP tests everything: known kernel bugs and CVEs, kernel syscalls and
>> user-space interfaces, networking and probably more. It is a huge test
>> suite.
>>
> 
> Which is why it is so valuable. It not only tests that kernel interfaces
> are behaving as expected, but also exercises them, preventing us from
> missing an important regression.

Thanks for the comments. All of the regression tests are valuable, not
only LTP. However, with this approach we might never reduce them...

> 
>> I was looking at LTP a lot last month on cloud instances and 95% of the
>> failures were flaky tests or instance-related. The remaining 5% were
>> indeed missing kernel commits for fixes, but those were not specific to a
>> derivative - they affected the generic kernel as well (e.g. missing
>> backports for Bionic).
>>
>> It is rather unlikely that LTP will pass on the main kernel but fail on
>> a derivative because of a kernel issue. More likely, the failure will be
>> seen on the main kernel as well.
> 
> Our derivative kernels carry specific patches and have different
> configurations. Sometimes a patchset will be submitted and tested on the
> generic kernel, but not on all derivatives. I still find it valuable that
> we test derivative kernels as much as we can.

The chances that cloud derivatives will hit an issue related to a separate
configuration or a cloud-specific patch are very, very low. Of course it
is always possible, but then we are back to the first paragraph - we won't
be able to reduce the test suite at all.

> 
> I agree that some tests are not as robust, and that means we should be
> improving the tests as well, so they bring more value to our testing. But I
> also thought that was why we had split the tests between ubuntu_ltp_stable and
> ubuntu_ltp. That may have brought a stink to the LTP name, unfortunately. Maybe
> it should have been ubuntu_ltp and ubuntu_ltp_unstable. But that is nitpicking.
> 
> If there are flaky tests on ubuntu_ltp_stable, they should be moved to
> ubuntu_ltp, and then we can start improving the tests on ubuntu_ltp so
> they can be moved to ubuntu_ltp_stable.

I was not biased by the "stable" name. My observations come from looking
at LTP for the last month or more (https://trello.com/b/f75oUoQt/kernel-test-issues).

> 
> And LTP is rapidly changing, though they care about the tests being
> applicable to older kernels and older environments. And though you think
> they are slow at picking up your changes, they are fast compared to other
> projects.

Nope, they are slow. I have 14 patches waiting without any conclusion
for a month. There is no answer like "fix this, I don't like this". They
just hang there waiting for something. Pinging moves them a little bit...

That's the reason we forked LTP.

> I think it's really important that we keep testing their latest versions.

Which we will be doing. We will be testing the latest LTP, just not
everywhere :)

> 
>>
>> Therefore I propose to run the full LTP only in some cases:
>> 1. On main kernels (so mostly metal),
>> 2. On HWE kernels (from which we have a derivative edge kernel, but HWE
>> is enough),
>> 3. On development kernels (Impish) everywhere,
>> 4. Maybe also on OEM kernels?
>>
>> In other cases run only a subset of LTP. Maybe only ubuntu_ltp_syscalls?
> 
> Definitely ubuntu_ltp_syscalls. If ubuntu_ltp_stable is not stable enough,
> we should prioritize fixing the tests in ltp_stable instead of the ones in
> ltp_unstable. And reviewing which tests are there may be a good step too,
> so we can figure out whether this is testing glibc, the kernel, or the
> hardware.
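
To make the proposal above a bit more concrete, here is a rough sketch in
Python of how the suite selection could look. The flavour names and suite
groupings below are illustrative only, not taken from our actual test
infrastructure:

    # Rough sketch only: which LTP suites to schedule for a given kernel.
    # Flavour names and suite groupings are illustrative, not the real
    # configuration used by our test infrastructure.

    FULL_LTP = ["ubuntu_ltp", "ubuntu_ltp_stable", "ubuntu_ltp_syscalls"]
    REDUCED_LTP = ["ubuntu_ltp_syscalls"]

    DEVELOPMENT_SERIES = {"impish"}          # point 3: development kernels
    FULL_LTP_FLAVOURS = {"generic", "hwe"}   # points 1 and 2 (maybe "oem" too)

    def ltp_suites(flavour, series):
        """Return the list of LTP suites to run for a kernel flavour."""
        if series in DEVELOPMENT_SERIES:
            return FULL_LTP                  # development kernels: everywhere
        if flavour in FULL_LTP_FLAVOURS:
            return FULL_LTP                  # main and HWE kernels
        return REDUCED_LTP                   # derivatives: syscalls only

    # Example: a cloud derivative on a stable series gets only syscalls.
    print(ltp_suites("aws", "focal"))        # ['ubuntu_ltp_syscalls']

The exact gating would of course live in the test infrastructure itself;
this is only meant to show how small the policy change is.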


Best regards,
Krzysztof


