[SRU Focal 0/3] LP: #1996678 - containerd regressions

Thadeu Lima de Souza Cascardo cascardo at canonical.com
Wed Nov 23 17:48:33 UTC 2022


[Impact]
Multiple users and partners have reported containerd/k8s/runc failures.
These manifest as containers being in the unknown state and failures
to communicate with containerd (response timeouts).

[Fixes]
Some reports attribute the failure to epoll, in particular commit a16ceb139610.
Others have attribute containerd failures to splice commit 97ef77c52b78.

The epoll commit was applied upstream after many other fixes were applied to
epoll. Those other fixes prevent race conditions between epoll_pwait timing out
and wakeups being missed. a16ceb139610 upstream would make such races produce
extra events, not miss events. But on current 5.4 kernel, specially without
commit 289caf5d8f6c, this would result in events more likely being missed.

Some tests have shown that when applying 65759097d804 and 289caf5d8f6c and not
reverting a16ceb139610, the number of responses timeouts compared to either
Ubuntu-5.4.0-131 (without a16ceb139610) and 5.4.0-132 (with a16ceb139610) go
down.

Also, 5.15 carry some epoll kselftests and at least epoll61 test passes once
those fixes are applied, and no other regressions in that test suite are found.
That particular test is demonstrating the described race condition from commit
289caf5d8f6c.

97ef77c52b78, on the other hand, has also been reverted on upstream 5.4.y as it
depended on a lot of other changes that have not been backported or targeted to
5.4. In particular, the reasoning upstream was NFS regressions.

In light of all this, even though these fixes may not be necessary to fix the
reported containerd bugs, they should go into all 5.4 kernels anyway as they
fix other real bugs.

[URGENCY]
Applying these fixes as soon as possible is in line with a strategy of getting
fixes out so they can be tested and verified, particularly for those users who
require offical Ubuntu builds. In case these cause more harm or do not fix the
regressions, they can be reverted and better fixes applied instead.


Roman Penyaev (1):
  epoll: call final ep_events_available() check under the lock

Sasha Levin (1):
  Revert "fs: check FMODE_LSEEK to control internal pipe splicing"

Soheil Hassas Yeganeh (1):
  epoll: check for events when removing a timed out thread from the wait
    queue

 fs/eventpoll.c | 69 ++++++++++++++++++++++++++++++--------------------
 fs/splice.c    | 10 +++++---
 2 files changed, 48 insertions(+), 31 deletions(-)

-- 
2.34.1




More information about the kernel-team mailing list