ACK: [Impish][SRU][PATCH 0/3] Fix KVM regression on Impish

Kleber Souza kleber.souza at canonical.com
Wed Apr 13 10:39:08 UTC 2022


On 11.04.22 16:01, Po-Hsu Lin wrote:
> [Impact]
> This is caused by commit 08335308 "KVM: x86: check PIR even for vCPUs
> with disabled APICv". That commit depends on 7e1901f6c "KVM: VMX: prepare
> sync_pir_to_irr for running with APICv disabled"; without it, a vCPU
> with APICv disabled triggers this warning in vmx_sync_pir_to_irr()
> in vmx.c:
>      WARN_ON(!vcpu->arch.apicv_active);
> 
> With warnings like:
> ------------[ cut here ]------------
> WARNING: CPU: 13 PID: 6997 at arch/x86/kvm/vmx/vmx.c:6336 vmx_sync_pir_to_irr+0x9e/0xc0 [kvm_intel]
> ? xfer_to_guest_mode_work+0xe2/0x110
> Modules linked in: vhost_net vhost vhost_iotlb tap xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables nfnetlink bridge stp llc nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm joydev input_leds ioatdma rapl intel_cstate efi_pstore ipmi_si mei_me mei mac_hid acpi_pad
> vcpu_run+0x4d/0x220 [kvm]
> acpi_power_meter sch_fq_codel ipmi_devintf ipmi_msghandler msr ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid mgag200 i2c_algo_bit drm_kms_helper crct10dif_pclmul syscopyarea crc32_pclmul sysfillrect sysimgblt ghash_clmulni_intel fb_sys_fops ixgbe cec aesni_intel rc_core crypto_simd xfrm_algo cryptd drm ahci dca i2c_i801 xhci_pci mdio libahci i2c_smbus lpc_ich xhci_pci_renesas wmi
> CPU: 13 PID: 6997 Comm: qemu-system-x86 Tainted: G W I 5.13.0-39-generic #44-Ubuntu
> Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.1008.031920151331 03/19/2015
> kvm_arch_vcpu_ioctl_run+0xc5/0x4f0 [kvm]
> RIP: 0010:vmx_sync_pir_to_irr+0x9e/0xc0 [kvm_intel]
> Code: e8 47 f5 18 00 8b 93 00 03 00 00 89 45 ec 83 e2 20 85 d2 74 dc 48 8b 55 f0 65 48 2b 14 25 28 00 00 00 75 1d 48 8b 5d f8 c9 c3 <0f> 0b eb 87 f0 80 4b 39 40 8b 93 00 03 00 00 8b 45 ec 83 e2 20 eb
> RSP: 0018:ffffae4d8d107c98 EFLAGS: 00010046
> RAX: 0000000000000000 RBX: ffff99c552942640 RCX: ffff99c5043a72f0
> RDX: ffff99c552942640 RSI: 0000000000000001 RDI: ffff99c552942640
> RBP: ffffae4d8d107cb0 R08: ffff99c86f6a7140 R09: 0000000000027100
> R10: 0000000042280000 R11: 000000000000000a R12: ffff99c552942640
> R13: 0000000000000000 R14: ffffae4d8d1a63e0 R15: ffff99c552942640
> FS: 00007f6ae9be7640(0000) GS:ffff99c86f680000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000010b8a6006 CR4: 00000000001726e0
> Call Trace:
> <TASK>
> kvm_vcpu_ioctl+0x243/0x5e0 [kvm]
> vcpu_enter_guest+0x383/0xf50 [kvm]
> ? xfer_to_guest_mode_work+0xe2/0x110
> ? kvm_vm_ioctl+0x364/0x730 [kvm]
> ? __fget_files+0x86/0xc0
> vcpu_run+0x4d/0x220 [kvm]
> __x64_sys_ioctl+0x91/0xc0
> do_syscall_64+0x61/0xb0
> ? fput+0x13/0x20
> ? exit_to_user_mode_prepare+0x37/0xb0
> ? syscall_exit_to_user_mode+0x27/0x50
> ? do_syscall_64+0x6e/0xb0
> ? syscall_exit_to_user_mode+0x27/0x50
> ? do_syscall_64+0x6e/0xb0
> ? do_syscall_64+0x6e/0xb0
> ? do_syscall_64+0x6e/0xb0
> entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7f6aebce1a2b
> Code: ff ff ff 85 c0 79 8b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d5 f3 0f 00 f7 d8 64 89 01 48
> RSP: 002b:00007f6ae8ffe3f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f6aebce1a2b
> RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000000c
> RBP: 0000557d3b429b90 R08: 0000557d3a4ebff0 R09: 00000000ffffffff
> kvm_arch_vcpu_ioctl_run+0xc5/0x4f0 [kvm]
> R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
> R13: 0000000000000001 R14: 0000000000003000 R15: 0000000000000000
> </TASK>
> ---[ end trace 5b722d71a78069b1 ]---
> 
> This warning message floods the system log files and can
> eventually fill the disk and crash the server.
> 
> The issue goes away either by reverting commit 08335308 or by applying
> the fixes below.
> 
> Reference:
> https://patchwork.kernel.org/project/kvm/patch/20211118072531.1534938-1-pbonzini@redhat.com/
> 
> [Fixes]
> * 0b8f11737 KVM: Add infrastructure and macro to mark VM as bugged
> * 673692735 KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VM
> * 7e1901f6c KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled
> 
> The fix is twofold: the first two patches fix the warning message
> flooding, so that the warning is printed only once. The third
> patch prevents the warning from being triggered.
> 
> The first patch needs to be backported as we're missing:
>    2fdef3a2ae kvm: add PM-notifier
>    fcfe1baedd KVM: stats: Support binary stats retrieval for a VM
> 
> The second patch needs some context adjustment. And the last one can
> be cherry-picked.
> 
> [Test]
> Test kernels can be found here:
> https://people.canonical.com/~phlin/kernel/lp-1966499-kvm-warn-flood/
> 
> This issue can be verified with LXD:
>    1. snap install lxd
>    2. lxc launch images:ubuntu/20.04 --vm vm1
> 
> On an affected system, the dmesg output is flooded with this warning
> message. With the patched kernel, the VM starts with a clean dmesg.
> 
> I have tested this kernel on Impish; the F-5.13 kernel has been tested by
> Daniël Vos (vosdev) on Launchpad. Both work as expected.
> 
> kvm-unit-tests has also been run on my Impish instance to ensure
> there are no other issues.
> 
> [Where problems could occur]
> This patchset changes how KVM bugs are reported in the kernel;
> if it is incorrect, it might affect VMX capability.
> 
> Paolo Bonzini (1):
>    KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled
> 
> Sean Christopherson (2):
>    KVM: Add infrastructure and macro to mark VM as bugged
>    KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the
>      VM
> 
>   arch/x86/kvm/svm/svm.c   |  2 +-
>   arch/x86/kvm/vmx/vmx.c   | 60 ++++++++++++++++++++++++++++++------------------
>   arch/x86/kvm/x86.c       |  4 ++++
>   include/linux/kvm_host.h | 28 +++++++++++++++++++++-
>   virt/kvm/kvm_main.c      | 10 ++++----
>   5 files changed, 75 insertions(+), 29 deletions(-)
> 

LGTM. Good test results.

Acked-by: Kleber Sacilotto de Souza <kleber.souza at canonical.com>

Thanks



