[SRU][Mantic][PATCH 0/1] kvm: Running perf against qemu processes results in page fault inside guest

Matthew Ruffell matthew.ruffell at canonical.com
Sun Feb 18 08:19:26 UTC 2024


BugLink: https://bugs.launchpad.net/bugs/2054218

[Impact]

Running perf against a QEMU/KVM process causes a page fault inside the guest
when Precise Event Based Sampling (PEBS) records intended for the host are
stored while the guest is running. Running perf against a single QEMU process
crashes the targeted guest, and running perf system wide crashes all running
guests on the system.

The issue was introduced in 6.0 by:

commit c59a1f106f5cd4843c097069ff1bb2ad72103a67
Author: Like Xu <like.xu at linux.intel.com>
Date:   Mon Apr 11 18:19:36 2022 +0800
Subject: KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c59a1f106f5cd4843c097069ff1bb2ad72103a67

This affects all 6.2 and 6.5 kernels. There is no known workaround, apart from
not using perf on affected systems.

[Fix]

The issue was fixed in 6.7 by:

commit 971079464001c6856186ca137778e534d983174a
Author: Paolo Bonzini <pbonzini at redhat.com>
Date:   Thu Jan 4 16:15:17 2024 +0100
Subject: KVM: x86/pmu: fix masking logic for MSR_CORE_PERF_GLOBAL_CTRL
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=971079464001c6856186ca137778e534d983174a

This reinstates the logic for setting MSR_CORE_PERF_GLOBAL_CTRL to what it was
before "KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS".

-               .guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
+               .guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask & ~pebs_mask,

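By De Morgan's law, (~cpuc->intel_ctrl_host_mask | ~pebs_mask) equals
~(cpuc->intel_ctrl_host_mask & pebs_mask), so the faulty logic clears only bits
that are both marked exclude_guest and using PEBS, and leaves every other
enabled bit set for the guest. The correct logic clears any bit that is either
marked exclude_guest or using PEBS, so that PEBS events are always treated as
host-only.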
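
The difference between the two expressions can be seen with a small userspace
sketch. This is not kernel code: the function names and the bit values below
are hypothetical and chosen only for illustration. It assumes three enabled
counters, with counter 0 marked exclude_guest and counter 1 using PEBS on
behalf of the host.

```c
#include <assert.h>
#include <stdint.h>

/* Faulty expression: clears only bits set in BOTH host_mask and pebs_mask. */
uint64_t guest_ctrl_faulty(uint64_t intel_ctrl, uint64_t host_mask,
                           uint64_t pebs_mask)
{
	return intel_ctrl & (~host_mask | ~pebs_mask);
}

/* Fixed expression: clears bits set in EITHER host_mask or pebs_mask. */
uint64_t guest_ctrl_fixed(uint64_t intel_ctrl, uint64_t host_mask,
                          uint64_t pebs_mask)
{
	return intel_ctrl & ~host_mask & ~pebs_mask;
}

/* Hypothetical example values, not taken from real hardware. */
void check_masking_example(void)
{
	uint64_t intel_ctrl = 0x7; /* counters 0, 1 and 2 enabled */
	uint64_t host_mask  = 0x1; /* counter 0 is exclude_guest */
	uint64_t pebs_mask  = 0x2; /* counter 1 uses PEBS for the host */

	/* No bit is in both masks, so the faulty form clears nothing. */
	assert(guest_ctrl_faulty(intel_ctrl, host_mask, pebs_mask) == 0x7);

	/* The fixed form leaves only counter 2 enabled for the guest. */
	assert(guest_ctrl_fixed(intel_ctrl, host_mask, pebs_mask) == 0x4);
}
```

With these values the faulty expression leaves the host's PEBS counter (bit 1)
enabled while the guest runs, which is the condition described in [Impact].
The two expressions only agree when every PEBS bit is also marked
exclude_guest.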

[Testcase]

Start a bare metal server. Enable KVM, start a few VMs. The VMs can be idle,
they don't require any workload.

$ sudo apt-get install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils uvtool
$ sudo reboot
$ ssh-keygen
$ uvt-simplestreams-libvirt sync --source http://cloud-images.ubuntu.com/daily release=jammy arch=amd64
$ uvt-kvm create --cpu 4 --memory 4096 --disk 10 jammy-a release=jammy arch=amd64
$ uvt-kvm create --cpu 4 --memory 4096 --disk 10 jammy-b release=jammy arch=amd64
$ uvt-kvm create --cpu 4 --memory 4096 --disk 10 jammy-c release=jammy arch=amd64
$ virsh list
 Id   Name      State
-------------------------
 2    jammy-a   running
 3    jammy-b   running
 4    jammy-c   running
$ uvt-kvm ssh jammy-a
(check that the guest is reachable over SSH, then log out)
$ ps aux | grep qemu
(note the PID of the jammy-a QEMU process and use it as $PID below)
$ perf top -p $PID
$ virsh console jammy-a
Escape character is ^] (Ctrl + ])
[  357.793039] BUG: unable to handle page fault for address: fffffe49178c6028
$ uvt-kvm ssh jammy-a
(no response)

Test packages are available in the following ppa:

https://launchpad.net/~mruffell/+archive/ubuntu/sf379502-test

With the test kernel installed, running perf against the PID of a QEMU process
no longer crashes the guest, and the guest remains accessible over SSH
afterward.

[Where problems could occur]

We are changing how the guest value of MSR_CORE_PERF_GLOBAL_CTRL is computed,
which determines which performance counters, including those using PEBS, stay
enabled while a guest runs. This affects any profiling tool run against KVM
based virtual machines, namely perf against QEMU.

If a regression were to occur, running perf against a VM could cause it to
page fault and subsequently crash, resulting in downtime.

The only workaround will be to disable all profiling tools until a fix is
available.

Paolo Bonzini (1):
  KVM: x86/pmu: fix masking logic for MSR_CORE_PERF_GLOBAL_CTRL

 arch/x86/events/intel/core.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

-- 
2.40.1



