Cmnt: [SRU][F/aws][PATCH v2 0/6] aws: proper fix for c5.18xlarge hibernation issues

Guilherme Piccoli gpiccoli at canonical.com
Tue May 18 18:41:03 UTC 2021


On Tue, May 18, 2021 at 12:26 PM Andrea Righi
<andrea.righi at canonical.com> wrote:
>
> BugLink: https://bugs.launchpad.net/bugs/1920944
>
> [Impact]
>
> In LP: #1918694 we applied a fix and a workaround to solve the
> hibernation issues on c5.18xlarge. The workaround was in the form of a
> SAUCE patch:
>
>   "UBUNTU: SAUCE: aws: kvm: double the size of hv_clock_boot"
>
> It looks like we can replace this workaround with a proper fix, by
> applying this patch:
>
> http://next.patchew.org/Linux/20210414123544.1060604-1-vkuznets@redhat.com/
>
> [Test plan]
>
> Create a c5.18xlarge instance, run the memory stress test script (the
> same test script that we are using to stress test hibernation), trigger
> the hibernate event, trigger the resume event. Repeat a couple of times
> and the problem is very likely to happen.
>
> [Fix]
>
> Replace "UBUNTU: SAUCE: aws: kvm: double the size of hv_clock_boot"
> with:
>
> http://next.patchew.org/Linux/20210414123544.1060604-1-vkuznets@redhat.com/
>
> The fix has been tested extensively in the AWS infrastructure with
> positive results.
>
> [Where problems could occur]
>
> The new code introduced by the fix also runs when a CPU is taken
> offline, so we may see regressions in KVM CPU hotplug.
>
> ----------------------------------------------------------------
> Changelog (v1 -> v2):
>  - new patch set from Red Hat
>
> NOTE: backport activity was minimal; it only required some context
> adjustments to properly apply the changes.
>
> Andrea Righi (1):
>       Revert "UBUNTU: SAUCE: aws: kvm: double the size of hv_clock_boot"
>
> Vitaly Kuznetsov (5):
>       x86/kvm: Fix pr_info() for async PF setup/teardown
>       x86/kvm: Teardown PV features on boot CPU as well
>       x86/kvm: Disable kvmclock on all CPUs on shutdown
>       x86/kvm: Disable all PV features on crash
>       x86/kvm: Unify kvm_pv_guest_cpu_reboot() with kvm_guest_cpu_offline()
>
>  arch/x86/include/asm/kvm_para.h |   9 ++----
>  arch/x86/kernel/kvm.c           | 113 ++++++++++++++++++++++++++++++++++++++++++++----------------------
>  arch/x86/kernel/kvmclock.c      |  28 ++---------------
>  3 files changed, 79 insertions(+), 71 deletions(-)
>
>

Thanks Andrea, very good patchset to have in our kernels!
I'm ready to ACK, but I'd like to clarify the following first:

(a) Should it be in 5.8/5.11 as well?

(b) Should it be sent to the main kernel and get pulled by all
derivatives, or really only to -aws?

(c) Also, patches are upstream[0], so should we have the IDs in the commits?

Cheers,

Guilherme


[0]
$ git log -5 --oneline arch/x86/kernel/kvm.c
384fc672f528 x86/kvm: Unify kvm_pv_guest_cpu_reboot() with kvm_guest_cpu_offline()
3d6b84132d2a x86/kvm: Disable all PV features on crash
c02027b5742b x86/kvm: Disable kvmclock on all CPUs on shutdown
8b79feffeca2 x86/kvm: Teardown PV features on boot CPU as well
0a269a008f83 x86/kvm: Fix pr_info() for async PF setup/teardown
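
Regarding (c), a minimal sketch of how one could check whether a backported
commit message already carries its upstream ID. It assumes the usual
"(cherry picked from commit <sha>)" / "(backported from commit <sha>)"
provenance lines; the helper name upstream_ids is hypothetical and not part
of any existing tooling.

```shell
# Hypothetical helper: print the upstream commit IDs referenced by a commit
# message read from stdin. Assumes provenance lines of the form
#   (cherry picked from commit <sha>)
#   (backported from commit <sha>)
# optionally followed by a trailing word such as a tree name.
upstream_ids() {
    sed -n -E 's/^\((cherry picked|backported) from commit ([0-9a-f]{7,40})( .*)?\)$/\2/p'
}

# Possible usage inside a kernel tree (<commit> is a placeholder):
#   git log -1 --format=%B <commit> | upstream_ids
```

An empty result would mean the commit message has no such line and the
upstream ID still needs to be added.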


