[PATCH] [act] UBUNTU: SAUCE: ubuntu_kernel_selftests: disable memory-hotplug
Paolo Pisati
paolo.pisati at canonical.com
Wed Jun 23 13:13:52 UTC 2021
The memory-hotplug test has been intermittently timing out (or trashing the test
VM, see below) on Impish/Hirsute ppc64el and x86-64 for quite some time now.
Upon further investigation, we found that memory-hotplug tends to spam the
system logs (kernel.log, syslog and the systemd journal) with thousands of
dump_page() entries (up to several GB in total) like this:
...
[ 898.286185] migrating pfn 11c462 failed ret:1
[ 898.286186] page:00000000491a3636 refcount:3 mapcount:0 mapping:00000000e646cbed index:0xc00066 pfn:0x11c462
[ 898.286188] memcg:ffff947290991000
[ 898.286188] aops:def_blk_aops ino:800002
[ 898.286191] flags: 0x17ffffc0002022(referenced|active|private|node=0|zone=2|lastcpupid=0x1fffff)
[ 898.286193] raw: 0017ffffc0002022 ffffb3618ba03ba8 ffffb3618ba03ba8 ffff947287522ab0
[ 898.286195] raw: 0000000000c00066 ffff947281729340 00000003ffffffff ffff947290991000
[ 898.286196] page dumped because: migration failure
...
At this point, two things can happen:
a) the constant flow of printk() slows down the VM to the point that a timeout
triggers (either the autotest timeout or the kernel selftests timeout, it
doesn't matter) and terminates memory-hotplug, and the VM resumes processing
the remaining ubuntu_kernel_selftests jobs
or
b) the filesystem fills up to 100%, memory-hotplug fails, and so do all the
remaining test jobs, since the VM is in an unusable state by then
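To tell scenario "b" apart from scenario "a" on a test VM, one option is to
check how full the filesystem is once memory-hotplug has ended. The following
is only a minimal sketch and not part of this patch: the helper names and the
95% threshold are assumptions chosen for illustration.

```python
import shutil


def usage_fraction(used: int, free: int) -> float:
    """Fraction of the space available to unprivileged users that is in use."""
    if used + free <= 0:
        raise ValueError("filesystem reports no usable space")
    return used / (used + free)


def filesystem_nearly_full(path: str = "/", threshold: float = 0.95) -> bool:
    """Return True when the filesystem holding `path` is at or above `threshold`.

    The threshold is an arbitrary illustrative value, not a tested limit.
    """
    usage = shutil.disk_usage(path)
    return usage_fraction(usage.used, usage.free) >= threshold


if __name__ == "__main__":
    print("nearly full:", filesystem_nearly_full("/"))
```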
Given we already disable memory-hotplug for arm* and cloud kernels, and to avoid
having our test sessions trashed by this single test, I propose to disable it
entirely, or at least until a rate-limiting solution is put in place.
If you want to reproduce this issue, just provision an OpenStack instance
(small, medium or large - size doesn't matter) and you will always end up in
scenario "b".
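When reproducing, it can help to quantify the log spam rather than eyeball it.
The sketch below counts dump_page() blocks by matching the two message formats
visible in the excerpt above; it is an illustration and not part of this patch.
The function name is hypothetical, and the log path is taken from the command
line because its location varies between systems.

```python
import re
import sys
from typing import Dict, Iterable

# Closing line of each dump_page() block, as seen in the excerpt above.
DUMP_MARKER = "page dumped because: migration failure"
# Line that precedes each block, e.g. "migrating pfn 11c462 failed ret:1".
FAILED_PFN = re.compile(r"migrating pfn ([0-9a-f]+) failed ret:(-?\d+)")


def summarize_dumps(lines: Iterable[str]) -> Dict[str, int]:
    """Count migration-failure dumps and the distinct pfns that failed."""
    dumps = 0
    pfns = set()
    for line in lines:
        if DUMP_MARKER in line:
            dumps += 1
        match = FAILED_PFN.search(line)
        if match:
            pfns.add(match.group(1))
    return {"dumps": dumps, "distinct_pfns": len(pfns)}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Stream the file so that multi-GB logs are not loaded into memory.
    with open(sys.argv[1], errors="replace") as log:
        print(summarize_dumps(log))
```

A small number of distinct pfns next to a large dump count would suggest that
the same pages are being retried and dumped repeatedly.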
Signed-off-by: Paolo Pisati <paolo.pisati at canonical.com>
---
ubuntu_kernel_selftests/control | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/ubuntu_kernel_selftests/control b/ubuntu_kernel_selftests/control
index e2874196..53c5584a 100644
--- a/ubuntu_kernel_selftests/control
+++ b/ubuntu_kernel_selftests/control
@@ -12,7 +12,7 @@ DOC = ""
name = 'ubuntu_kernel_selftests'
-tests = [ 'setup','breakpoints','cpu-hotplug','efivarfs','memfd','memory-hotplug','mount','net','ptrace','seccomp','timers','powerpc','user','ftrace' ]
+tests = [ 'setup','breakpoints','cpu-hotplug','efivarfs','memfd','mount','net','ptrace','seccomp','timers','powerpc','user','ftrace' ]
#
# The seccomp tests on 4.19+ on non-x86 are known to be fail and
--
2.25.1