APPLIED: [bionic:linux-azure][PATCH 0/1] Kdump freezes on 4.15 kernel in Azure
Kelsey Skunberg
kelsey.skunberg at canonical.com
Sat Aug 21 02:26:02 UTC 2021
Applied to B/azure-4.15 master-next. Thank you!
-Kelsey
On 2021-07-23 15:14:52, Guilherme G. Piccoli wrote:
> BugLink: https://bugs.launchpad.net/bugs/1882623
>
> [Impact]
>
> * When kdumping on trusty/4.15 in an Azure instance, we quite frequently observe
> a stall in the kdump kernel: it gets blocked and soon we see a stack trace like
> the following:
>
> [ 65.452007] INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 65.456004] 1-...!: (0 ticks this GP) idle=488/0/0 softirq=1/1 fqs=0
> [ 65.456004] (detected by 0, t=15002 jiffies, g=707, c=706, q=8457)
> [ 65.456004] rcu_sched kthread starved for 15002 jiffies! g707 c706 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=1
>
> * Using the Azure serial console, we collected a sysrq-w trace when the issue happened:
> [ 529.515013] sysrq: Show Blocked State
> [ 529.517730] task PC stack pid father
> [ 529.519006] kworker/u4:2 D 0 94 2 0x80000000
> [ 529.519006] Workqueue: events_unbound fsnotify_mark_destroy_workfn
> [ 529.519006] Call Trace:
> [ 529.519006] __schedule+0x292/0x880
> [ 529.519006] schedule+0x36/0x80
> [ 529.519006] schedule_timeout+0x1d5/0x2f0
> [ 529.519006] ? check_preempt_wakeup+0x162/0x260
> [ 529.519006] wait_for_completion+0xa5/0x110
> [ 529.519006] ? wake_up_q+0x80/0x80
> [ 529.519006] __synchronize_srcu.part.14+0x67/0x80
> [ 529.519006] ? trace_raw_output_rcu_utilization+0x50/0x50
> [ 529.519006] ? __switch_to_asm+0x41/0x70
> [ 529.519006] synchronize_srcu+0xd1/0xd6
> [ 529.519006] fsnotify_mark_destroy_workfn+0x6d/0xc0
> [ 529.519006] process_one_work+0x14e/0x390
> [ 529.519006] worker_thread+0x1cc/0x3d0
> [ 529.519006] kthread+0x105/0x140
> [ 529.519006] ? max_active_store+0x60/0x60
> [ 529.519006] ? kthread_bind+0x20/0x20
> [ 529.519006] ret_from_fork+0x35/0x40
> [ 529.519006] udevadm D 0 544 1 0x00000000
> [ 529.519006] Call Trace:
> [ 529.519006] __schedule+0x292/0x880
> [ 529.519006] schedule+0x36/0x80
> [ 529.519006] schedule_timeout+0x1d5/0x2f0
> [ 529.519006] ? try_to_wake_up+0x4a/0x460
> [ 529.519006] ? try_to_wake_up+0x4a/0x460
> [ 529.519006] wait_for_completion+0xa5/0x110
> [ 529.519006] ? wake_up_q+0x80/0x80
> [ 529.519006] __flush_work.isra.29+0x119/0x1b0
> [ 529.519006] ? destroy_worker+0x90/0x90
> [ 529.519006] flush_delayed_work+0x3f/0x50
> [ 529.519006] fsnotify_wait_marks_destroyed+0x15/0x20
> [ 529.519006] fsnotify_destroy_group+0x4e/0xc0
> [ 529.519006] inotify_release+0x1e/0x50
> [ 529.519006] __fput+0xea/0x220
> [ 529.519006] ____fput+0xe/0x10
> [ 529.519006] task_work_run+0x8c/0xb0
> [ 529.519006] exit_to_usermode_loop+0x70/0xa9
> [ 529.519006] do_syscall_64+0x1b5/0x1e0
> [ 529.519006] entry_SYSCALL_64_after_hwframe+0x41/0xa6
> [ 529.519006] dhclient D 0 573 572 0x00000000
> [ 529.519006] Call Trace:
> [ 529.519006] __schedule+0x292/0x880
> [ 529.519006] schedule+0x36/0x80
> [ 529.519006] schedule_timeout+0x1d5/0x2f0
> [ 529.519006] ? aa_profile_af_perm+0xb4/0xf0
> [ 529.519006] wait_for_completion+0xa5/0x110
> [ 529.519006] ? wake_up_q+0x80/0x80
> [ 529.519006] __wait_rcu_gp+0x123/0x150
> [ 529.519006] synchronize_sched+0x4e/0x60
> [ 529.519006] ? __call_rcu+0x2f0/0x2f0
> [ 529.519006] ? trace_raw_output_rcu_utilization+0x50/0x50
> [ 529.519006] synchronize_net+0x1c/0x30
> [ 529.519006] __unregister_prot_hook+0xcd/0xf0
> [ 529.519006] packet_do_bind+0x1bd/0x250
> [ 529.519006] packet_bind+0x2f/0x50
> [ 529.519006] SYSC_bind+0xd8/0x110
> [ 529.519006] ? sock_alloc_file+0x91/0x130
> [ 529.519006] SyS_bind+0xe/0x10
> [ 529.519006] do_syscall_64+0x80/0x1e0
> [ 529.519006] entry_SYSCALL_64_after_hwframe+0x41/0xa6
>
> * Bisecting mainline kernels, we found that v4.17-rc1 did not reproduce the issue,
> whereas v4.16 did. A fine-grained git bisect then led us to the fix: the
> following patch, when backported to a problematic version, resolves the issue:
> d8e462e19305 ("Drivers: hv: vmbus: Implement Direct Mode for stimer0")
>
> * In Azure/Hyper-V, before the aforementioned commit, timer interrupts were delivered
> to the guest through a vmbus message, the communication mechanism between Hyper-V
> guests and the hypervisor. With the patch, a check is made (through an MSR-like
> mechanism) and, if the hypervisor supports it, a direct timer IRQ mechanism is put
> in place instead of the vmbus channel.
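>
> * To illustrate, the core of the change boils down to the decision sketched below.
> This is only a simplified sketch of the direct-mode selection on the per-CPU timer
> setup path (identifier names approximate the upstream commit and the surrounding
> hv code; it is not a verbatim excerpt of the backport): when the hypervisor
> advertises direct-mode support for synthetic timer 0, the timer is programmed to
> raise a dedicated interrupt vector; otherwise it keeps delivering expirations as a
> vmbus message on the shared synthetic interrupt.
>
> 	/* Sketch of the per-CPU stimer0 setup, loosely following drivers/hv/hv.c
> 	 * after d8e462e19305; names are approximate, not a verbatim excerpt.
> 	 */
> 	union hv_stimer_config timer_cfg = {0};
>
> 	if (ms_hyperv.misc_features & HV_STIMER_DIRECT_MODE_AVAILABLE) {
> 		/* Direct mode: stimer0 expiration arrives as a regular IRQ on a
> 		 * dedicated vector, handled on the arch side (mshyperv.c).
> 		 */
> 		timer_cfg.direct_mode = 1;
> 		timer_cfg.apic_vector = HYPERV_STIMER0_VECTOR;
> 	} else {
> 		/* Legacy mode: expiration is delivered as a vmbus message on the
> 		 * shared synthetic interrupt, the path the kdump kernel trips over.
> 		 */
> 		timer_cfg.direct_mode = 0;
> 		timer_cfg.sintx = VMBUS_MESSAGE_SINT;
> 	}
> 	timer_cfg.enable = 1;
> 	timer_cfg.auto_enable = 1;
> 	/* The configuration is written to the stimer0 synthetic MSR - the
> 	 * "MSR-like mechanism" mentioned above.
> 	 */
> 	hv_init_timer_config(0, timer_cfg.as_uint64);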
>
> * Our theory is that in the kdump kernel, especially due to its single-CPU nature,
> the vmbus-messaged timer IRQ could interfere with scheduling and create a deadlock
> condition, which is what we observe in the stack traces. Hence, we propose
> backporting this patch to the Azure 4.15 kernels.
>
> * Our hypothesis about why this wasn't observed in Bionic is that changes in the
> kdump mechanism make it start earlier, which might hide the scheduling issue.
> Lack of testing could also explain why it was not observed there.
>
> [Test Case]
>
> * Perform a kdump on Trusty/4.15; after a few rounds, you should see the RCU
> stall trace mentioned above. With the patch, no issue is observed, and we could
> perform ~30 kdumps in a row.
>
> [Where problems could occur]
>
> * Since the patch touches IRQ code (although restricted to Hyper-V), it's a
> somewhat invasive change. It could lead to issues in the IRQ subsystem, although
> the patch has been upstream since 2018 and no follow-up fixes for it were found
> (we rely on Fixes: tags to track those).
>
>
> Michael Kelley (1):
> Drivers: hv: vmbus: Implement Direct Mode for stimer0
>
> arch/x86/entry/entry_32.S | 3 ++
> arch/x86/entry/entry_64.S | 3 ++
> arch/x86/include/asm/hardirq.h | 3 ++
> arch/x86/include/asm/hyperv-tlfs.h | 3 ++
> arch/x86/include/asm/irq_vectors.h | 3 +-
> arch/x86/include/asm/mshyperv.h | 13 +++++++
> arch/x86/kernel/cpu/mshyperv.c | 40 ++++++++++++++++++++
> arch/x86/kernel/irq.c | 7 ++++
> drivers/hv/hv.c | 59 ++++++++++++++++++++++++++++--
> drivers/hv/hyperv_vmbus.h | 4 +-
> 10 files changed, 133 insertions(+), 5 deletions(-)
>
> --
> 2.31.1
>
>
> --
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team