APPLIED: [B][PATCH 0/2] Fix for LP#1821259 (pending patches for) Fix for deadlock in cpu_stopper
Khaled Elmously
khalid.elmously at canonical.com
Thu Mar 28 03:05:20 UTC 2019
On 2019-03-21 20:48:34, Mauricio Faria de Oliveira wrote:
> BugLink: https://bugs.launchpad.net/bugs/1821259
>
> Bionic only needs 2 of the 4 patches submitted for Xenial.
> All patches are applied / not needed on Cosmic and later.
>
> [Impact]
>
> * This problem hard-locks 2 CPUs in a deadlock, which in turn
> soft-locks other CPUs; the system becomes unusable.
>
> * This is relatively rare / difficult to hit because it's a
> corner case in scheduling/load balancing that needs specific
> timing with the CPU stopper code, and it needs an SMP plus
> _NUMA_ system (but it can be hit with the synthetic test case
> attached to the LP bug).
>
> * Since SMP plus NUMA usually equals _servers_ it looks like
> a good idea to prevent this bug / hard lockups / rebooting.
>
> * The fix resolves the potential deadlock by removing one of
> the calls required to deadlock from under the locked code.
>
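
For readers unfamiliar with the pattern described in the last bullet, here
is a minimal userspace sketch of "decide on the wake-up under the lock, issue
it after unlocking". It is not the kernel code from these patches. It assumes
the call moved out from under the lock is the wake-up of the stopper thread;
`struct stopper`, `stopper_queue_work()` and `stopper_wake()` are hypothetical
names, and pthreads stand in for the kernel's locking and wake-up primitives.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-CPU stopper: a work counter, the lock
 * protecting it, and the condition variable a worker would sleep on. */
struct stopper {
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    bool            enabled;        /* work is only accepted while enabled */
    int             pending;        /* number of queued work items */
    bool            woke_unlocked;  /* last wake-up ran with the lock free */
};

static void stopper_init(struct stopper *s)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->wake, NULL);
    s->enabled = true;
    s->pending = 0;
    s->woke_unlocked = false;
}

/* Wake the worker. The trylock only records whether the lock was free at
 * this point; it is instrumentation for the sketch, not part of the pattern. */
static void stopper_wake(struct stopper *s)
{
    if (pthread_mutex_trylock(&s->lock) == 0) {
        s->woke_unlocked = true;
        pthread_mutex_unlock(&s->lock);
    } else {
        s->woke_unlocked = false;
    }
    pthread_cond_signal(&s->wake);
}

/* Queue one work item. Returns true if the item was queued. */
static bool stopper_queue_work(struct stopper *s)
{
    bool queued;

    pthread_mutex_lock(&s->lock);
    queued = s->enabled;
    if (queued)
        s->pending++;               /* under the lock: only bookkeeping */
    pthread_mutex_unlock(&s->lock);

    if (queued)
        stopper_wake(s);            /* the wake-up happens after unlocking */
    return queued;
}
```

Because the wake-up runs only after the lock is dropped, code reached from the
wake-up path can take other locks without nesting them inside `s->lock`, which
removes one edge from a potential lock-ordering cycle.
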
> [Test Case]
>
> * There's a synthetic test case to reproduce this problem
> (although without the stack traces - just a system hang)
> attached to this LP bug.
>
> * It uses kprobes/mdelay/cpu stopper calls to force the code
> to execute and force the timing/locking condition to occur.
>
> * $ sudo insmod kmod-stopper.ko
>
> Some dmesg logging occurs, and the system either hangs or does not.
> See examples in the LP bug comments.
>
> [Regression Potential]
>
> * These are patches to the CPU stop_machine.c code, and they
> change a bit how it works; however, there are no further upstream
> fixes for these patches, and they are still at the top
> of the 'git log --oneline -- kernel/stop_machine.c' output.
>
> * These patches have been verified with the synthetic test case
> and 'stress-ng --class scheduler --sequential 0' (no regressions)
> on a guest with 2 CPUs and on one physical system with 24 CPUs.
>
> [Other Info]
>
> * The patches are required on Xenial and later.
> * There are 4 patches for Xenial, and 2 patches pending for Bionic.
> * All patches are applied from Cosmic onwards.
>
> Isaac J. Manjarres (1):
> stop_machine: Disable preemption after queueing stopper threads
>
> Prasad Sodagudi (1):
> stop_machine: Atomically queue and wake stopper threads
>
> kernel/stop_machine.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> --
> 2.17.1
>
>
> --
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team