[SRU][Xenial][PULL] Guests using IBRS incur a large performance penalty (LP: #1764956)

Juerg Haefliger juerg.haefliger at canonical.com
Tue Jan 8 11:33:47 UTC 2019


On Mon, 7 Jan 2019 19:26:55 +0100
Stefan Bader <stefan.bader at canonical.com> wrote:

> On 19.12.18 11:03, Juerg Haefliger wrote:
> > [Impact]
> > IBRS can be mistakenly left enabled in the host when switching from
> > an IBRS-enabled VM, which causes a performance overhead in the host.
> > Conversely, IBRS can be mistakenly disabled in the VM when
> > context-switching from the host. This could be considered a security
> > hole (CVE).
> > 
> > [Fix]
> > The patch fixes the logic inside x86_virt_spec_ctrl() so that it
> > checks ibrs_enabled and ORs hostval with SPEC_CTRL_IBRS, because
> > x86_spec_ctrl_base is zero by default. The upstream implementation
> > differs from Xenial's: upstream doesn't use IBRS as the formal fix,
> > so the base value defaults to zero.
> > 
> > In addition, after a VM exit, the SPEC_CTRL register needs to be
> > saved manually by reading the SPEC_CTRL MSR, because the MSR
> > intercept is disabled by default in hardware_setup() (v4.4) and
> > vmx_init() (v3.13). Guest access to the SPEC_CTRL MSR is direct and
> > doesn't trigger a trap, so vmx_set_msr() isn't called.
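
A minimal shell model of the host-value logic described in [Fix], as a
sketch only and not the kernel code. The function name host_spec_ctrl and
its two arguments are hypothetical; hostval, ibrs_enabled, SPEC_CTRL_IBRS
and the zero default of x86_spec_ctrl_base are taken from the description
above. It assumes that any non-zero ibrs_enabled means the host wants
IBRS set while in the kernel.

```shell
#!/bin/bash
# IBRS is bit 0 of the IA32_SPEC_CTRL MSR (0x48).
SPEC_CTRL_IBRS=1

# host_spec_ctrl BASE IBRS_ENABLED   (hypothetical helper)
# Prints the SPEC_CTRL value the host should run with: BASE stands in for
# x86_spec_ctrl_base, with the IBRS bit ORed in when ibrs_enabled is
# non-zero.
host_spec_ctrl() {
    local hostval=$1 ibrs_enabled=$2
    if (( ibrs_enabled != 0 )); then
        hostval=$(( hostval | SPEC_CTRL_IBRS ))
    fi
    echo "$hostval"
}

host_spec_ctrl 0 0   # base 0, host IBRS off
host_spec_ctrl 0 1   # base 0, host IBRS on
```

With a base of zero, the host value carries the IBRS bit only if
ibrs_enabled is non-zero, which is the state that has to be restored
after a VM exit regardless of what the guest wrote to the MSR.
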
> > 
> > The v3.13 kernel hasn't been tested. However, the patch can be viewed
> > at:
> > http://kernel.ubuntu.com/git/gavinguo/ubuntu-trusty-amd64.git/log/?h=sf00191076-sru
> > 
> > The v4.4 patch:
> > http://kernel.ubuntu.com/git/gavinguo/ubuntu-xenial.git/log/?h=sf00191076-spectre-v2-regres-backport-juerg
> > 
> > [Test]
> > 
> > The patch has been tested on 4.4.0-140.166 and works fine.
> > 
> > Reproduction environment:
> > Guest kernel version: 4.4.0-138.164
> > Host kernel version: 4.4.0-140.166
> > 
> > (host IBRS, guest IBRS)
> > 
> > - 1). (0, 1).
> > The case can be reproduced by the following instructions:
> > guest$ echo 1 | sudo tee /proc/sys/kernel/ibrs_enabled
> > 1
> > 
> > <Several minutes later...>
> > 
> > host$ cat /proc/sys/kernel/ibrs_enabled
> > 0
> > host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
> > 11111111111111000000000000000000010010100000000000000000
> > 
> > The IBRS bit in the SPEC_CTRL MSR is mistakenly enabled on some of
> > the CPUs.
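
To see which CPUs are affected, the concatenated rdmsr output can be
decoded into CPU numbers. A minimal bash sketch: the helper name
list_ibrs_cpus is hypothetical, and it assumes one hex digit per CPU,
in CPU order, as in the transcript above.

```shell
#!/bin/bash
# list_ibrs_cpus VALUES   (hypothetical helper)
# VALUES is a string with one hex digit per CPU, in CPU order. Prints the
# space-separated indices of the CPUs whose value has bit 0 (IBRS) set.
list_ibrs_cpus() {
    local values=$1 i cpus=()
    for (( i = 0; i < ${#values}; i++ )); do
        if (( 0x${values:i:1} & 1 )); then
            cpus+=("$i")
        fi
    done
    echo "${cpus[*]}"
}

list_ibrs_cpus 1100101   # prints: 0 1 4 6
```

On a live host the argument could be built from the same loop, with the
newlines stripped, e.g. "$(for i in {0..55}; do sudo rdmsr 0x48 -p $i;
done | tr -d '\n')".
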
> > 
> > host$ taskset -c 5 stress-ng -c 1 --cpu-ops 2500
> > stress-ng: info:  [11264] defaulting to a 86400 second run per stressor
> > stress-ng: info:  [11264] dispatching hogs: 1 cpu
> > stress-ng: info:  [11264] cache allocate: default cache size: 35840K
> > stress-ng: info:  [11264] successful run completed in 33.48s
> > 
> > The host kernel doesn't notice that the IBRS bit is enabled, so the
> > situation is the same as "echo 2 > /proc/sys/kernel/ibrs_enabled" in
> > the host. The stress-ng run is a pure userspace CPU workload, so
> > performance drops to about 1/3: without IBRS enabled, the same run
> > takes about 10s.
> > 
> > - 2). (1, 1): disabling IBRS in the host should give (0, 1), but it
> > actually becomes (0, 0). The guest IBRS has been mistakenly disabled.
> > 
> > guest$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
> > guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
> > 11111111111111111111111111111111111111111111111111111111
> > 
> > host$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
> > host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
> > 11111111111111111111111111111111111111111111111111111111
> > host$ echo 0 | sudo tee /proc/sys/kernel/ibrs_enabled
> > host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
> > 00000000000000000000000000000000000000000000000000000000
> > 
> > guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
> > 00000000000000000000000000000000000000000000000000000000
> > 
> > 
> > [juergh: MSR-isolation between guests and the host is incomplete in
> >  Xenial. This PR is supposed to fix this and bring Xenial up to par with
> >  stable v4.9.]
> > 
> > Signed-off-by: Juerg Haefliger <juergh at canonical.com>  
> 
> Just for sanity checking: this pull request (hate hate hate)

? The PR itself, or the fact that it's a PR, or the submitter of the PR?


> replaces the
> submitted patch, right?

Yes.

...Juerg


> 
> -Stefan
> > ---
> > 
> > The following changes since commit d0b9a387cf1d68745c558d04fd3aa980497d1529:
> > 
> >   UBUNTU: SAUCE: x86/speculation: Move RSB_CTXSW hunk (2018-12-13 13:03:55 +0100)
> > 
> > are available in the Git repository at:
> > 
> >   git://git.launchpad.net/~juergh/+git/xenial-linux lp1764956-v2
> > 
> > for you to fetch changes up to 7ad0e9a99c1466f8fee92cba5ffeaa0af83f6622:
> > 
> >   UBUNTU: SAUCE: Restore the IBRS host state on VMEXIT (2018-12-19 10:58:24 +0100)
> > 
> > ----------------------------------------------------------------
> > Ashok Raj (1):
> >       KVM/x86: Add IBPB support
> > 
> > David Matlack (1):
> >       KVM: nVMX: mark vmcs12 pages dirty on L2 exit
> > 
> > Jim Mattson (5):
> >       kvm: nVMX: VMCLEAR an active shadow VMCS after last use
> >       kvm: vmx: Scrub hardware GPRs at VM-exit
> >       KVM: nVMX: Eliminate vmcs02 pool
> >       kvm: x86: IA32_ARCH_CAPABILITIES is always supported
> >       kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb
> > 
> > Juerg Haefliger (4):
> >       UBUNTU: SAUCE: [Fix] KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD
> >       UBUNTU: SAUCE: [Fix] x86/KVM/VMX: Add L1D flush logic
> >       UBUNTU: SAUCE: KVM: Move code fragments, cleanup and re-indent
> >       UBUNTU: SAUCE: Restore the IBRS host state on VMEXIT
> > 
> > KarimAllah Ahmed (3):
> >       KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL
> >       KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL
> >       X86/nVMX: Properly set spec_ctrl and pred_cmd before merging MSRs
> > 
> > Paolo Bonzini (5):
> >       KVM: VMX: introduce alloc_loaded_vmcs
> >       KVM: VMX: make MSR bitmaps per-VCPU
> >       KVM/x86: Remove indirect MSR op calls from SPEC_CTRL
> >       KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely()
> >       KVM: VMX: fixes for vmentry_l1d_flush module parameter
> > 
> > Radim Krčmář (1):
> >       KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC
> > 
> > Thomas Gleixner (2):
> >       KVM: SVM: Move spec control call after restore of GS
> >       KVM: x86: SVM: Call x86_spec_ctrl_set_guest/host() with interrupts disabled
> > 
> > Tom Lendacky (1):
> >       KVM: SVM: Add MSR-based feature support for serializing LFENCE
> > 
> > Wanpeng Li (1):
> >       KVM: X86: Allow userspace to define the microcode version
> > 
> >  arch/x86/include/asm/kvm_host.h |   1 +
> >  arch/x86/kernel/cpu/bugs.c      |   4 +
> >  arch/x86/kvm/cpuid.c            |  25 +-
> >  arch/x86/kvm/cpuid.h            |  74 ++--
> >  arch/x86/kvm/svm.c              | 209 +++++++++--
> >  arch/x86/kvm/vmx.c              | 777 ++++++++++++++++++++++------------------
> >  arch/x86/kvm/x86.c              |  12 +-
> >  7 files changed, 691 insertions(+), 411 deletions(-)
> >   
> 
> 
