ACK: [SRU][N][PULL] L2 Guest migration: continuously dumping while running NFS guest migration

Thibault Ferrante thibault.ferrante at canonical.com
Mon Sep 23 13:01:43 UTC 2024


Acked-by: Thibault Ferrante <thibault.ferrante at canonical.com>

On 02-09-2024 10:39, frank.heimes at canonical.com wrote:
> BugLink: https://bugs.launchpad.net/bugs/2076406
> 
> SRU Justification:
> 
> [ Impact ]
> 
>   * While doing ISST testing it turned out that a 2nd level (KVM)
>     guest (aka VM) continuously dumped when running an NFS
>     guest migration.
> 
> [ Test Plan ]
> 
>   * Set up two IBM Power 10 systems (with firmware 1060, which offers
>     support for KVM) with Ubuntu Server 24.04 for ppc64el.
> 
>   * Set up qemu/KVM on both of these systems to allow guest migration.
> 
>   * Set up a KVM guest and place its disk on an NFS volume.
> 
>   * Now initiate a guest migration.
> 
>   * Without the two patches the initiator system will start to dump.
> 
>   * Since this setup requires a special firmware level,
>     the verification will be done by the IBM Power team.
> 
> [ Where problems could occur ]
> 
>   * Although the patch set looks huge,
>     the patches themselves are relatively small and not very invasive,
>     and I would consider them mainly as fixes.
> 
>   * kvmppc_set_one_reg_hv() wrongly gets the value instead of
>     setting it for MMCR3.
> 
>   * And kvmppc_get_one_reg_hv() for SDAR wrongly returns
>     SIAR instead of SDAR, which is easy to trace.
> 
>   * Then a one-reg interface for the DEXCR register, KVM_REG_PPC_DEXCR,
>     is introduced. Issues can arise here if the initialization
>     or the case statement is wrong.
>     A fix was added to keep the nested guest DEXCR in sync.
>     The guest state element defined for DEXCR was already there,
>     but was not actually used; this is fixed now (DEXCR GSID).
>     If the initialization or the code in the case statement is wrong,
>     this can harm the guest state,
>     and the guest state may get out of sync.
> 
>   * Another one-reg register identifier was introduced
>     that is used to read and set the virtual HASHKEYR
>     for the guest during enter/exit with KVM_REG_PPC_HASHKEYR.
>     Again, the initialization and the case code are critical.
>     Code was added to keep the nested guest HASHKEYR in sync.
>     Again, the guest state element defined for HASHKEYR was there,
>     but not used, which is fixed now (HASHKEYR GSID).
>     If the initialization or the code in the case statement is wrong,
>     this can harm the guest state,
>     and it can harm the L2 guest during enter or exit.
> 
>   * Yet another one-reg identifier was introduced
>     that is used to read and set the virtual HASHPKEYR
>     for the guest during enter/exit with KVM_REG_PPC_HASHPKEYR.
>     And again, the guest state element defined for HASHPKEYR
>     was there but ignored, which is now fixed (HASHPKEYR GSID).
>     If the initialization or the code in the case statement is wrong,
>     this can harm the guest state,
>     and it can harm the L2 guest during enter or exit.
> 
> [ Other Info ]
> 
>   * Since (nested) KVM support is new on P10,
>     this does not affect older Power generations
>     (P9 is the only other hw generation that is supported by 24.04,
>     but it only supports native virtualization).
> 
>   * Both patches have been accepted upstream since v6.11(-rc1),
>     hence will be in oracular,
>     and are also tagged upstream as stable updates.
> 
>   * Since the required firmware FW1060 is relatively new,
>     we can assume that not many users have run into this issue yet.
> 
> The following changes since commit 219da5546e11baa4a535f6dfbc872a105c8d0892:
> 
>    md: fix deadlock between mddev_suspend and flush bio (2024-08-30 08:44:25 +0200)
> 
> are available in the Git repository at:
> 
>    https://git.launchpad.net/~fheimes/+git/lp2076406/ 5b3b9fba52edd3f48596bb32771a1c0ebaa1a093
> 
> for you to fetch changes up to 5b3b9fba52edd3f48596bb32771a1c0ebaa1a093:
> 
>    KVM: PPC: Book3S HV nestedv2: Keep nested guest HASHPKEYR in sync (2024-08-30 15:26:52 +0200)
> 
> ----------------------------------------------------------------
> Shivaprasad G Bhat (8):
>        KVM: PPC: Book3S HV: Fix the set_one_reg for MMCR3
>        KVM: PPC: Book3S HV: Fix the get_one_reg of SDAR
>        KVM: PPC: Book3S HV: Add one-reg interface for DEXCR register
>        KVM: PPC: Book3S HV nestedv2: Keep nested guest DEXCR in sync
>        KVM: PPC: Book3S HV: Add one-reg interface for HASHKEYR register
>        KVM: PPC: Book3S HV nestedv2: Keep nested guest HASHKEYR in sync
>        KVM: PPC: Book3S HV: Add one-reg interface for HASHPKEYR register
>        KVM: PPC: Book3S HV nestedv2: Keep nested guest HASHPKEYR in sync
> 
>   Documentation/virt/kvm/api.rst        |  3 +++
>   arch/powerpc/include/asm/kvm_host.h   |  3 +++
>   arch/powerpc/include/uapi/asm/kvm.h   |  3 +++
>   arch/powerpc/kvm/book3s_hv.c          | 22 ++++++++++++++++++++--
>   arch/powerpc/kvm/book3s_hv.h          |  3 +++
>   arch/powerpc/kvm/book3s_hv_nestedv2.c | 18 ++++++++++++++++++
>   6 files changed, 50 insertions(+), 2 deletions(-)
> 
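
For reference, the first two fixes in the series concern wrong case
bodies in the one-reg paths: the set path for MMCR3 performed a get,
and the get path for SDAR returned SIAR. Below is a minimal sketch of
that bug class in plain C. All toy_* names and the register ids are
hypothetical and only illustrate the pattern; they are not the kernel's
structures, helpers, or KVM_REG_PPC_* values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register ids, NOT the real KVM_REG_PPC_* values. */
enum toy_reg { TOY_REG_MMCR3, TOY_REG_SIAR, TOY_REG_SDAR };

/* Hypothetical stand-in for the per-vCPU register state. */
struct toy_vcpu {
	uint64_t mmcr3;
	uint64_t siar;
	uint64_t sdar;
};

/* Read one register into *val. Returns 0 on success, -1 on unknown id. */
static int toy_get_one_reg(const struct toy_vcpu *v, enum toy_reg id,
			   uint64_t *val)
{
	switch (id) {
	case TOY_REG_MMCR3:
		*val = v->mmcr3;
		break;
	case TOY_REG_SIAR:
		*val = v->siar;
		break;
	case TOY_REG_SDAR:
		/* Bug class: returning v->siar here hands back the wrong register. */
		*val = v->sdar;
		break;
	default:
		return -1;
	}
	return 0;
}

/* Write one register from *val. Returns 0 on success, -1 on unknown id. */
static int toy_set_one_reg(struct toy_vcpu *v, enum toy_reg id,
			   const uint64_t *val)
{
	switch (id) {
	case TOY_REG_MMCR3:
		/* Bug class: reading the register into *val here never updates it. */
		v->mmcr3 = *val;
		break;
	case TOY_REG_SIAR:
		v->siar = *val;
		break;
	case TOY_REG_SDAR:
		v->sdar = *val;
		break;
	default:
		return -1;
	}
	return 0;
}

/* Copy every register from src to dst via get/set, as a migration would. */
static void toy_migrate(const struct toy_vcpu *src, struct toy_vcpu *dst)
{
	static const enum toy_reg ids[] = {
		TOY_REG_MMCR3, TOY_REG_SIAR, TOY_REG_SDAR
	};
	unsigned int i;

	for (i = 0; i < sizeof(ids) / sizeof(ids[0]); i++) {
		uint64_t val = 0;
		int rc = toy_get_one_reg(src, ids[i], &val);

		assert(rc == 0);
		rc = toy_set_one_reg(dst, ids[i], &val);
		assert(rc == 0);
	}
}
```

With either bug present, a get/set round trip like toy_migrate() would
leave the destination with a stale MMCR3 or with SIAR's value in SDAR,
which is why such state only shows up as wrong once it is transferred.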


-- 
Thibault


