ACK: [SRU][J][PATCH v2 0/2] KVM: arm64: fix softlockups in stage2_apply_range
Tim Gardner
tim.gardner at canonical.com
Thu Mar 7 14:12:32 UTC 2024
On 3/5/24 17:50, Krister Johansen wrote:
> BugLink: https://bugs.launchpad.net/bugs/2056227
>
> [Impact]
>
> Tearing down KVM VMs on arm64 can cause softlockups to appear on the console.
> When terminating VMs with more than 100 GB of memory and 4k pages, the memory
> unmap times often exceed 20 seconds, which can trigger the softlockup detector.
> Portions of the unmap path also run with interrupts disabled while TLB
> invalidation instructions execute, which can further contribute to latency
> problems. My team has observed networking latency problems when the CPU
> performing the teardown is also assigned to handle a NIC interrupt.
>
> Fortunately, a solution has been in place since Linux 6.1. A pair of small
> patches modifies stage2_apply_range to operate on smaller memory ranges before
> performing a cond_resched. With these patches applied, softlockups are no
> longer observed when tearing down VMs with large amounts of memory.
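
For readers unfamiliar with the pattern described above, here is a minimal
user-space sketch of walking a large range in bounded batches and yielding
between them. It is not the kernel patch: BATCH_SIZE (assumed to be 1 GiB
here), batch_end(), apply_range_batched(), and count_batch() are hypothetical
names chosen for illustration, and sched_yield() merely stands in for the
kernel's cond_resched.

```c
#include <sched.h>
#include <stdint.h>

/* Assumed batch size for illustration only: 1 GiB. */
#define BATCH_SIZE (1ULL << 30)

/* Callback applied to each sub-range [addr, end). Returns 0 on success. */
typedef int (*range_fn)(uint64_t addr, uint64_t end, void *ctx);

/* Running totals collected by the example callback below. */
struct batch_stats {
    uint64_t batches;
    uint64_t bytes;
};

/*
 * Return the end of the current batch: the next BATCH_SIZE-aligned boundary
 * after addr, clamped to end. Comparing (x - 1) values keeps the result
 * correct even if addr + BATCH_SIZE wraps around to 0.
 */
static uint64_t batch_end(uint64_t addr, uint64_t end)
{
    uint64_t boundary = (addr + BATCH_SIZE) & ~(BATCH_SIZE - 1);

    return (boundary - 1 < end - 1) ? boundary : end;
}

/*
 * Apply fn to [addr, end) one batch at a time, giving the scheduler a
 * chance to run other work between batches instead of holding the CPU
 * for the whole range.
 */
static int apply_range_batched(uint64_t addr, uint64_t end,
                               range_fn fn, void *ctx)
{
    while (addr < end) {
        uint64_t next = batch_end(addr, end);
        int ret = fn(addr, next, ctx);

        if (ret)
            return ret;
        if (next != end)
            sched_yield(); /* stand-in for cond_resched */
        addr = next;
    }
    return 0;
}

/* Example callback: record how many batches and bytes were visited. */
static int count_batch(uint64_t addr, uint64_t end, void *ctx)
{
    struct batch_stats *stats = ctx;

    stats->batches++;
    stats->bytes += end - addr;
    return 0;
}
```

Calling apply_range_batched() with count_batch over a 100 GiB range that
starts 512 MiB into a 1 GiB-aligned region visits 101 batches: one partial
batch at each end and 99 full ones in between. The smaller the batch, the
more often the loop reaches a point where it can yield.
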
>
> Although I also submitted the patches to 5.15 LTS (link to LTS submission in
> "Backport" section), I'd appreciate it if Ubuntu were willing to take this
> submission in parallel since the impact has left us unable to utilize arm64 for
> kvm until we can either migrate our hypervisors to hugepages, pick up this fix,
> or some combination of the two.
>
> [Backport]
>
> Backport the following fixes from linux 6.1:
>
> 3b5c082bbf KVM: arm64: Work out supported block level at compile time
> 5994bc9e05 KVM: arm64: Limit stage2_apply_range() batch size to largest block
>
> The fix is in 5994bc9e05 and 3b5c082bbf is a dependency that was submitted as
> part of the series. The original submission is here:
>
> https://lore.kernel.org/all/20221007234151.461779-1-oliver.upton@linux.dev/
>
> I've also submitted the patches to 5.15 LTS here:
>
> https://lore.kernel.org/stable/cover.1709665227.git.kjlx@templeofstupid.com/
>
> Both fixes cherry picked cleanly and there were no conflicts.
>
> [Test]
>
> Executed a variation of the test from 5994bc9e05 as well as my own run of
> kvm_page_table_test on a VM with 4k pages and more than 100 GB of memory.
> Without the patches, softlockups were observed in both tests. With the patches
> applied, the tests ran without incident.
>
> This was tested against both LTS 5.15.150 and linux-aws-5.15.0-1055.
>
> [Potential Regression]
>
> Regression potential is low. These patches have been present in Linux since 6.1
> and appear to have needed no further maintenance.
>
> [Change in v2]
>
> I ran format-patch without the --from option which incorrectly generated the
> first series without leaving Oliver in place as the author. The v2 should
> retain the correct authorship. Apologies for the mistake.
>
>
> Oliver Upton (2):
> KVM: arm64: Work out supported block level at compile time
> KVM: arm64: Limit stage2_apply_range() batch size to largest block
>
> arch/arm64/include/asm/kvm_pgtable.h | 18 +++++++++++++-----
> arch/arm64/include/asm/stage2_pgtable.h | 20 --------------------
> arch/arm64/kvm/mmu.c | 9 ++++++++-
> 3 files changed, 21 insertions(+), 26 deletions(-)
>
Acked-by: Tim Gardner <tim.gardner at canonical.com>
--
-----------
Tim Gardner
Canonical, Inc