ACK: [SRU][J][PATCH 0/2] 5.15.0-85 live migration regression

Roxana Nicolescu roxana.nicolescu at canonical.com
Wed Sep 20 07:06:52 UTC 2023


On 20/09/2023 04:22, Chengen Du wrote:
> BugLink: https://bugs.launchpad.net/bugs/2036675
>
> SRU Justification:
>
> [Impact]
> The fixes introduced for LP#2032164, aimed at resolving a live migration issue, have unintentionally led to a regression.
> Consequently, a previously functional live migration pattern now fails when tested with the 5.15.0-85 kernel from -proposed.
>
> Specifically, live migration from a PKRU-enabled host running a kernel older than 5.15.0-85 to a host running the 5.15.0-85 kernel fails.
> This happens regardless of whether the destination host has PKRU enabled.
> In both scenarios the live migration fails, but in different ways: one hangs, while the other fails due to a PCID flag issue.
>
> [Fix]
> To address the issue introduced in LP#2032164, we will begin by reverting the following commits.
> Subsequently, we will actively pursue a more comprehensive solution.
>
> commit fa9225d64f215e8109de10f6b6c7a08f033d0ec0
> Author: Dr. David Alan Gilbert <dgilbert at redhat.com>
> Date: Mon Aug 21 14:47:28 2023 +0800
>
>      KVM: x86: Always enable legacy FP/SSE in allowed user XFEATURES
>
> commit 27a189b881278c8ad9c16b0ee05668d724352733
> Author: Leonardo Bras <leobras at redhat.com>
> Date: Mon Aug 21 14:47:27 2023 +0800
>
>      x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0
>
> [Test Plan]
> With the reverts applied, the issue originally fixed by LP#2032164 will reoccur.
> To reproduce it, follow these steps:
> 1. Set up two machines: one with PKRU support and the other without.
> 2. Start a guest that lacks PKRU support on the machine with PKRU support.
> 3. Use libvirt to migrate that guest to the machine that lacks PKRU support.
> 4. The following error appears on the destination machine:
> KVM: entry failed, hardware error 0x80000021
>
> If you're running a guest on an Intel machine without unrestricted mode
> support, the failure can be most likely due to the guest entering an invalid
> state for Intel VT. For example, the guest maybe running in big real mode
> which is not supported on less recent Intel processors.
>
> EAX=86cf7970 EBX=00000000 ECX=00000001 EDX=005b0036
> ESI=00000087 EDI=00000087 EBP=87c03e38 ESP=87c03e18
> EIP=86cf7d5e EFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0000 00000000 0000ffff 00009300
> CS =f000 ffff0000 0000ffff 00009b00
> SS =0000 00000000 0000ffff 00009300
> DS =0000 00000000 0000ffff 00009300
> FS =0000 00000000 0000ffff 00009300
> GS =0000 00000000 0000ffff 00009300
> LDT=0000 00000000 0000ffff 00008200
> TR =0000 00000000 0000ffff 00008b00
> GDT= 00000000 0000ffff
> IDT= 00000000 0000ffff
> CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
> DR6=00000000ffff0ff0 DR7=0000000000000400
> EFER=0000000000000000
> Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 2023-07-09T03:03:14.911750Z qemu-system-x86_64: terminating on signal 15 from pid 4134 (/usr/sbin/libvirtd)
> 2023-07-09 03:03:15.312+0000: shutting down, reason=destroyed
>
> [Where problems could occur]
> Reverting these commits restores the original behavior,
> so the issue from LP#2032164 will persist until a more comprehensive fix is available.
>
> Chengen Du (2):
>    Revert "KVM: x86: Always enable legacy FP/SSE in allowed user
>      XFEATURES"
>    Revert "x86/kvm/fpu: Limit guest user_xfeatures to supported bits of
>      XCR0"
>
>   arch/x86/kvm/cpuid.c | 8 --------
>   1 file changed, 8 deletions(-)
>
Acked-by: Roxana Nicolescu <roxana.nicolescu at canonical.com>


