ACK/Cmnt: [bionic:linux-azure][PATCH 0/1] Kdump freezes on 4.15 kernel in Azure

Tim Gardner tim.gardner at canonical.com
Wed Jul 28 11:53:34 UTC 2021


Acked-by: Tim Gardner <tim.gardner at canonical.com>

The target kernel is really bionic:linux-azure-4.15

On 7/23/21 12:14 PM, Guilherme G. Piccoli wrote:
> BugLink: https://bugs.launchpad.net/bugs/1882623
> 
> [Impact]
> 
> * When kdumping on trusty/4.15 in an Azure instance, we quite frequently observe
> a stall in the kdump kernel: it gets blocked, and soon we see a stack like
> the following:
> 
> [ 65.452007] INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 65.456004] 1-...!: (0 ticks this GP) idle=488/0/0 softirq=1/1 fqs=0
> [ 65.456004] (detected by 0, t=15002 jiffies, g=707, c=706, q=8457)
> [ 65.456004] rcu_sched kthread starved for 15002 jiffies! g707 c706 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=1
> 
> * By using the Azure serial console, we collected a sysrq-w when the issue happens:
> [ 529.515013] sysrq: Show Blocked State
> [ 529.517730] task PC stack pid father
> [ 529.519006] kworker/u4:2 D 0 94 2 0x80000000
> [ 529.519006] Workqueue: events_unbound fsnotify_mark_destroy_workfn
> [ 529.519006] Call Trace:
> [ 529.519006] __schedule+0x292/0x880
> [ 529.519006] schedule+0x36/0x80
> [ 529.519006] schedule_timeout+0x1d5/0x2f0
> [ 529.519006] ? check_preempt_wakeup+0x162/0x260
> [ 529.519006] wait_for_completion+0xa5/0x110
> [ 529.519006] ? wake_up_q+0x80/0x80
> [ 529.519006] __synchronize_srcu.part.14+0x67/0x80
> [ 529.519006] ? trace_raw_output_rcu_utilization+0x50/0x50
> [ 529.519006] ? __switch_to_asm+0x41/0x70
> [ 529.519006] synchronize_srcu+0xd1/0xd6
> [ 529.519006] fsnotify_mark_destroy_workfn+0x6d/0xc0
> [ 529.519006] process_one_work+0x14e/0x390
> [ 529.519006] worker_thread+0x1cc/0x3d0
> [ 529.519006] kthread+0x105/0x140
> [ 529.519006] ? max_active_store+0x60/0x60
> [ 529.519006] ? kthread_bind+0x20/0x20
> [ 529.519006] ret_from_fork+0x35/0x40
> [ 529.519006] udevadm D 0 544 1 0x00000000
> [ 529.519006] Call Trace:
> [ 529.519006] __schedule+0x292/0x880
> [ 529.519006] schedule+0x36/0x80
> [ 529.519006] schedule_timeout+0x1d5/0x2f0
> [ 529.519006] ? try_to_wake_up+0x4a/0x460
> [ 529.519006] ? try_to_wake_up+0x4a/0x460
> [ 529.519006] wait_for_completion+0xa5/0x110
> [ 529.519006] ? wake_up_q+0x80/0x80
> [ 529.519006] __flush_work.isra.29+0x119/0x1b0
> [ 529.519006] ? destroy_worker+0x90/0x90
> [ 529.519006] flush_delayed_work+0x3f/0x50
> [ 529.519006] fsnotify_wait_marks_destroyed+0x15/0x20
> [ 529.519006] fsnotify_destroy_group+0x4e/0xc0
> [ 529.519006] inotify_release+0x1e/0x50
> [ 529.519006] __fput+0xea/0x220
> [ 529.519006] ____fput+0xe/0x10
> [ 529.519006] task_work_run+0x8c/0xb0
> [ 529.519006] exit_to_usermode_loop+0x70/0xa9
> [ 529.519006] do_syscall_64+0x1b5/0x1e0
> [ 529.519006] entry_SYSCALL_64_after_hwframe+0x41/0xa6
> [ 529.519006] dhclient D 0 573 572 0x00000000
> [ 529.519006] Call Trace:
> [ 529.519006] __schedule+0x292/0x880
> [ 529.519006] schedule+0x36/0x80
> [ 529.519006] schedule_timeout+0x1d5/0x2f0
> [ 529.519006] ? aa_profile_af_perm+0xb4/0xf0
> [ 529.519006] wait_for_completion+0xa5/0x110
> [ 529.519006] ? wake_up_q+0x80/0x80
> [ 529.519006] __wait_rcu_gp+0x123/0x150
> [ 529.519006] synchronize_sched+0x4e/0x60
> [ 529.519006] ? __call_rcu+0x2f0/0x2f0
> [ 529.519006] ? trace_raw_output_rcu_utilization+0x50/0x50
> [ 529.519006] synchronize_net+0x1c/0x30
> [ 529.519006] __unregister_prot_hook+0xcd/0xf0
> [ 529.519006] packet_do_bind+0x1bd/0x250
> [ 529.519006] packet_bind+0x2f/0x50
> [ 529.519006] SYSC_bind+0xd8/0x110
> [ 529.519006] ? sock_alloc_file+0x91/0x130
> [ 529.519006] SyS_bind+0xe/0x10
> [ 529.519006] do_syscall_64+0x80/0x1e0
> [ 529.519006] entry_SYSCALL_64_after_hwframe+0x41/0xa6
> 
> * Bisecting mainline kernels, we found that v4.17-rc1 didn't reproduce the issue,
> whereas v4.16 reproduced. Then, a fine-grained git bisect led us to the fix - the
> following patch, when backported to a problematic version, fixes the issue:
> d8e462e19305 ("Drivers: hv: vmbus: Implement Direct Mode for stimer0")
> 
> * In Azure/Hyper-V, before the aforementioned commit, timer interrupts were
> delivered from the hypervisor to the guest through a VMBus message, the usual
> communication mechanism between Hyper-V guests and the hypervisor. With the
> patch, a check is made (through an MSR-like mechanism), and if the hypervisor
> supports it, a direct timer IRQ mechanism is put in place instead of the
> VMBus channel.
> 
> * Our theory is that in the kdump kernel, especially due to its single-CPU
> nature, the VMBus-messaged timer IRQ can interfere with scheduling and create
> a deadlock condition, which is what we observe in the stack traces. Hence, we
> hereby propose backporting this patch to the Azure 4.15 kernels.
> 
> * Our hypothesis for why this wasn't observed on Bionic is that changes in the
> kdump mechanism make it start earlier, which might hide the scheduling issue.
> A lack of testing could also explain why it was never observed there.
> 
> [Test Case]
> 
> * Perform a kdump on Trusty/4.15; after some rounds, you should see the
> mentioned RCU stack trace. With the patch, no issue is observed, and we were
> able to perform ~30 kdumps in a row.
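A sketch of one reproduction round, assuming kdump-tools is installed and a crashkernel= memory reservation is already configured on the instance (do not run this on a machine you care about — it deliberately panics the kernel):

```shell
# Ensure the magic SysRq interface is enabled, then trigger a crash;
# the machine panics and boots into the kdump kernel.
echo 1 | sudo tee /proc/sys/kernel/sysrq
echo c | sudo tee /proc/sysrq-trigger

# On an affected kernel, watch the Azure serial console for the RCU stall
# during the kdump boot. After the instance comes back, confirm whether a
# dump was captured:
ls /var/crash/
```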
> 
> [Where problems could occur]
> 
> * Since the patch touches IRQ code (although restricted to Hyper-V), it's a
> somewhat invasive change. It could lead to issues in the IRQ subsystem,
> although the patch has been upstream since 2018 and no follow-up fixes for it
> were found (we rely on Fixes: tags to track those).
> 
> 
> Michael Kelley (1):
>    Drivers: hv: vmbus: Implement Direct Mode for stimer0
> 
>   arch/x86/entry/entry_32.S          |  3 ++
>   arch/x86/entry/entry_64.S          |  3 ++
>   arch/x86/include/asm/hardirq.h     |  3 ++
>   arch/x86/include/asm/hyperv-tlfs.h |  3 ++
>   arch/x86/include/asm/irq_vectors.h |  3 +-
>   arch/x86/include/asm/mshyperv.h    | 13 +++++++
>   arch/x86/kernel/cpu/mshyperv.c     | 40 ++++++++++++++++++++
>   arch/x86/kernel/irq.c              |  7 ++++
>   drivers/hv/hv.c                    | 59 ++++++++++++++++++++++++++++--
>   drivers/hv/hyperv_vmbus.h          |  4 +-
>   10 files changed, 133 insertions(+), 5 deletions(-)
> 

-- 
-----------
Tim Gardner
Canonical, Inc
