NACK: [SRU][Xenial][PULL] Guests using IBRS incur a large performance penalty (LP: #1764956)

Juerg Haefliger juerg.haefliger at canonical.com
Tue Jan 15 07:29:05 UTC 2019


The update to stable 4.4.168 brings in many of the commits from this PR, so
I'm NACKing it. I will send a fix for LP: #1764956 once all the current
stable updates have been applied.

...Juerg


On Wed, 19 Dec 2018 11:03:19 +0100
Juerg Haefliger <juerg.haefliger at canonical.com> wrote:

> [Impact]
> IBRS can be mistakenly left enabled on the host after switching away
> from an IBRS-enabled VM, which causes a performance overhead on the
> host. Conversely, IBRS can be mistakenly disabled in the VM when
> context-switching from the host, which could be considered a CVE.
> 
> [Fix]
> The patch fixes the logic inside x86_virt_spec_ctrl() so that it checks
> ibrs_enabled and ORs hostval with SPEC_CTRL_IBRS, because
> x86_spec_ctrl_base is zero by default. The upstream implementation
> differs from Xenial's: upstream doesn't use IBRS as the formal fix, so
> the base value defaults to zero.
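
A rough sketch of the hostval computation described in the [Fix] paragraph, written as a toy shell model rather than the actual kernel code. The function name host_spec_ctrl and its two arguments are hypothetical and only illustrate the OR step; the only value taken as given is that SPEC_CTRL_IBRS is bit 0 of the SPEC_CTRL MSR.

```shell
# Toy model only, not the kernel implementation.
SPEC_CTRL_IBRS=1   # bit 0 of the SPEC_CTRL MSR

# host_spec_ctrl BASE IBRS_ENABLED
# BASE stands in for x86_spec_ctrl_base, IBRS_ENABLED for the sysctl value.
# Prints the SPEC_CTRL value the host expects to see after a VM exit.
host_spec_ctrl() {
    hostval=$1
    if [ "$2" -ne 0 ]; then
        # IBRS is in use on the host, so the expected value must carry the bit
        hostval=$((hostval | SPEC_CTRL_IBRS))
    fi
    echo "$hostval"
}

host_spec_ctrl 0 0   # prints 0: base is zero and IBRS is off
host_spec_ctrl 0 1   # prints 1: base is zero but IBRS is on
```

With a zero base and no OR step, the expected host value would be 0 even when the host has IBRS enabled, which is the situation the paragraph describes.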
> 
> In addition, after a VM exit the guest's SPEC_CTRL value needs to be
> saved manually by reading the SPEC_CTRL MSR, because the MSR intercept
> is disabled by default in hardware_setup() (v4.4) and vmx_init()
> (v3.13). Guest accesses to the SPEC_CTRL MSR are direct and don't
> trigger a trap, so vmx_set_msr() is never called.
> 
> The v3.13 kernel hasn't been tested. However, the patch can be viewed
> at:
> http://kernel.ubuntu.com/git/gavinguo/ubuntu-trusty-amd64.git/log/?h=sf00191076-sru
> 
> The v4.4 patch:
> http://kernel.ubuntu.com/git/gavinguo/ubuntu-xenial.git/log/?h=sf00191076-spectre-v2-regres-backport-juerg
> 
> [Test]
> 
> The patch has been tested on 4.4.0-140.166 and works fine.
> 
> The reproducing environment:
> Guest kernel version: 4.4.0-138.164
> Host kernel version: 4.4.0-140.166
> 
> (host IBRS, guest IBRS)
> 
> - 1). (0, 1).
> The case can be reproduced by the following instructions:
> guest$ echo 1 | sudo tee /proc/sys/kernel/ibrs_enabled
> 1
> 
> <Several minutes later...>
> 
> host$ cat /proc/sys/kernel/ibrs_enabled
> 0
> host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
> 11111111111111000000000000000000010010100000000000000000
> 
> On some CPUs the IBRS bit in the SPEC_CTRL MSR is mistakenly
> enabled.
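
To read the concatenated rdmsr output above more easily, here is a small sketch that lists which CPU indexes report 1. It assumes each rdmsr call printed a single digit, so that position i in the string corresponds to CPU i; the helper name ibrs_cpus is hypothetical and not part of the original report.

```shell
# Hypothetical helper: print the indexes of the CPUs whose digit is 1
# in a concatenated per-CPU string (position 0 is CPU 0).
ibrs_cpus() {
    bits=$1
    i=0
    out=""
    while [ -n "$bits" ]; do
        rest=${bits#?}            # string without its first character
        first=${bits%"$rest"}     # the first character itself
        if [ "$first" = 1 ]; then
            out="${out:+$out }$i"
        fi
        bits=$rest
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

# Example with the host output from the report
ibrs_cpus 11111111111111000000000000000000010010100000000000000000
```

The result can be used to pick a CPU for the taskset run, so that the benchmark lands on a core where the bit is set.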
> 
> host$ taskset -c 5 stress-ng -c 1 --cpu-ops 2500
> stress-ng: info:  [11264] defaulting to a 86400 second run per stressor
> stress-ng: info:  [11264] dispatching hogs: 1 cpu
> stress-ng: info:  [11264] cache allocate: default cache size: 35840K
> stress-ng: info:  [11264] successful run completed in 33.48s
> 
> The host kernel doesn't notice that the IBRS bit is enabled, so the
> situation is the same as "echo 2 > /proc/sys/kernel/ibrs_enabled" on the
> host. Since stress-ng here is a pure userspace CPU workload, performance
> drops to about 1/3: without IBRS enabled, the same run takes about 10s.
> 
> - 2). (1, 1): disabling IBRS on the host should give (0, 1), but it
> actually becomes (0, 0). The guest IBRS has been mistakenly disabled.
> 
> guest$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
> guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
> 11111111111111111111111111111111111111111111111111111111
> 
> host$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
> host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
> 11111111111111111111111111111111111111111111111111111111
> host$ echo 0 | sudo tee /proc/sys/kernel/ibrs_enabled
> host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
> 00000000000000000000000000000000000000000000000000000000
> 
> guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
> 00000000000000000000000000000000000000000000000000000000
> 
> 
> [juergh: MSR-isolation between guests and the host is incomplete in
>  Xenial. This PR is supposed to fix this and bring Xenial up to par with
>  stable v4.9.]
> 
> Signed-off-by: Juerg Haefliger <juergh at canonical.com>
> ---
> 
> The following changes since commit d0b9a387cf1d68745c558d04fd3aa980497d1529:
> 
>   UBUNTU: SAUCE: x86/speculation: Move RSB_CTXSW hunk (2018-12-13 13:03:55 +0100)
> 
> are available in the Git repository at:
> 
>   git://git.launchpad.net/~juergh/+git/xenial-linux lp1764956-v2
> 
> for you to fetch changes up to 7ad0e9a99c1466f8fee92cba5ffeaa0af83f6622:
> 
>   UBUNTU: SAUCE: Restore the IBRS host state on VMEXIT (2018-12-19 10:58:24 +0100)
> 
> ----------------------------------------------------------------
> Ashok Raj (1):
>       KVM/x86: Add IBPB support
> 
> David Matlack (1):
>       KVM: nVMX: mark vmcs12 pages dirty on L2 exit
> 
> Jim Mattson (5):
>       kvm: nVMX: VMCLEAR an active shadow VMCS after last use
>       kvm: vmx: Scrub hardware GPRs at VM-exit
>       KVM: nVMX: Eliminate vmcs02 pool
>       kvm: x86: IA32_ARCH_CAPABILITIES is always supported
>       kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb
> 
> Juerg Haefliger (4):
>       UBUNTU: SAUCE: [Fix] KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD
>       UBUNTU: SAUCE: [Fix] x86/KVM/VMX: Add L1D flush logic
>       UBUNTU: SAUCE: KVM: Move code fragments, cleanup and re-indent
>       UBUNTU: SAUCE: Restore the IBRS host state on VMEXIT
> 
> KarimAllah Ahmed (3):
>       KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL
>       KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL
>       X86/nVMX: Properly set spec_ctrl and pred_cmd before merging MSRs
> 
> Paolo Bonzini (5):
>       KVM: VMX: introduce alloc_loaded_vmcs
>       KVM: VMX: make MSR bitmaps per-VCPU
>       KVM/x86: Remove indirect MSR op calls from SPEC_CTRL
>       KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely()
>       KVM: VMX: fixes for vmentry_l1d_flush module parameter
> 
> Radim Krčmář (1):
>       KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC
> 
> Thomas Gleixner (2):
>       KVM: SVM: Move spec control call after restore of GS
>       KVM: x86: SVM: Call x86_spec_ctrl_set_guest/host() with interrupts disabled
> 
> Tom Lendacky (1):
>       KVM: SVM: Add MSR-based feature support for serializing LFENCE
> 
> Wanpeng Li (1):
>       KVM: X86: Allow userspace to define the microcode version
> 
>  arch/x86/include/asm/kvm_host.h |   1 +
>  arch/x86/kernel/cpu/bugs.c      |   4 +
>  arch/x86/kvm/cpuid.c            |  25 +-
>  arch/x86/kvm/cpuid.h            |  74 ++--
>  arch/x86/kvm/svm.c              | 209 +++++++++--
>  arch/x86/kvm/vmx.c              | 777 ++++++++++++++++++++++------------------
>  arch/x86/kvm/x86.c              |  12 +-
>  7 files changed, 691 insertions(+), 411 deletions(-)
