[SRU][Trusty][Xenial][PATCH v2] UBUNTU: SAUCE: x86/speculation: Fix the IBRS synchronization

Gavin Guo gavin.guo at canonical.com
Mon Dec 3 15:59:43 UTC 2018


On Mon, Dec 3, 2018 at 9:18 PM Juerg Haefliger
<juerg.haefliger at canonical.com> wrote:
>
> On Fri, 30 Nov 2018 17:44:37 +0800
> Gavin Guo <gavin.guo at canonical.com> wrote:
>
> > BugLink: https://launchpad.net/bugs/1764956
> >
> > Ubuntu v4.4 kernel uses the in-house patches for IBRS. The backports
> > still have some issues causing the IBRS status wrong when
> > context-switching between the VM and host. For example, the IBRS would
> > be mistakenly enabled in the host when the switching from an IBRS-enabled
> > VM and that causes the performance overhead in the host. The other
> > condition could also mistakenly disables the IBRS in VM when
> > context-switching from the host. And this could be considered a CVE host.
> >
> > The detail different situations analysis:
> >
> > The reproducing environment:
> > Guest kernel version: 4.4.0-138.164
> > Host kernel version: 4.4.0-140.166
> >
> > (host IBRS, guest IBRS)
> >
> > - 1). (0, 1).
> > The case can be reproduced by the following instructions:
> > guest$ echo 1 | sudo tee /proc/sys/kernel/ibrs_enabled
> > 1
> >
> > <Several minutes later...>
> >
> > host$ cat /proc/sys/kernel/ibrs_enabled
> > 0
> > host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
> > 11111111111111000000000000000000010010100000000000000000
> >
> > Some of the IBRS bit inside the SPEC_CTRL MSR are mistakenly
> > enabled.
> >
> > host$ taskset -c 5 stress-ng -c 1 --cpu-ops 2500
> > stress-ng: info:  [11264] defaulting to a 86400 second run per stressor
> > stress-ng: info:  [11264] dispatching hogs: 1 cpu
> > stress-ng: info:  [11264] cache allocate: default cache size: 35840K
> > stress-ng: info:  [11264] successful run completed in 33.48s
> >
> > The host kernel didn't notice the IBRS bit is enabled. So, the situation
> > is the same as "echo 2 > /proc/sys/kernel/ibrs_enabled" in the host.
> > And running the stress-ng is a pure userspace CPU capability
> > calculation. So, the performance downgrades to about 1/3. Without the
> > IBRS enabled, it needs about 10s.
> >
> > - 2). (1, 1) disables IBRS in host -> (0, 1) actually it becomes (0, 0).
> > The guest IBRS has been mistakenly disabled.
> >
> > guest$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
> > guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
> > 11111111111111111111111111111111111111111111111111111111
> >
> > host$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
> > host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
> > 11111111111111111111111111111111111111111111111111111111
> > host$ echo 0 | sudo tee /proc/sys/kernel/ibrs_enabled
> > host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
> > 00000000000000000000000000000000000000000000000000000000
> >
> > guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
> > 00000000000000000000000000000000000000000000000000000000
> >
> > Fixes: 4d8d3dbed275 ("UBUNTU: SAUCE: x86/bugs, KVM: Support the combination ...")
> > Fixes: f676aa34b402 ("x86/kvm: add MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD ...")
> > Signed-off-by: Gavin Guo <gavin.guo at canonical.com>
> > ---
> >  arch/x86/kernel/cpu/bugs.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> > index 60907abf12f5..e5f1ba148e3c 100644
> > --- a/arch/x86/kernel/cpu/bugs.c
> > +++ b/arch/x86/kernel/cpu/bugs.c
> > @@ -185,6 +185,13 @@ x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
> >               guestval = hostval & ~x86_spec_ctrl_mask;
> >               guestval |= guest_spec_ctrl & x86_spec_ctrl_mask;
> >
> > +             /*
> > +              * Check the host IBRS status to make IBRS regsiter update
> > +              * correctly.
> > +              */
> > +             if (ibrs_enabled)
> > +                     hostval |= SPEC_CTRL_IBRS;
>
> This means we're setting IBRS on the host even if ibrs_enabled == 1. I
> don't think that's correct. With ibrs_enabled == 1 we could VMENTER with
> IBRS cleared but with the above we will set it on VMEXIT which is
> incorrect, no?

Good point! The reason why it should be set to one is based on the
definition of the "ibrs_enabled == 1". When setting "ibrs_enabled ==
1", the IBRS is enabled _ONLY_ in the _KERNEL_ mode. In user space,
the IBRS is disabled.

It's possible from the two paths:

i). Boot time
start_kernel -> check_bugs -> spectre_v2_select_mitigation ->
set_ibrs_enabled(1)

ii). Sysctl path
ibrs_enabled_handler -> set_ibrs_enabled(1)

In set_ibrs_enabled(1), the IBRS register isn't set. It just assigns
one to the ibrs_enabled variable.

So, where is the point to enable the IBRS register in kernel mode?

When ibrs_enabled = 1, it won't jump over the __ASM_ENABLE_IBRS, so,
the IBRS is set(for 0/2, it will jump over the __ASM_ENABLE_IBRS to
"10:", then, the IBRS won't set in the ENABLE_IBRS).

.macro ENABLE_IBRS
        testl   $1, ibrs_enabled
        jz      10f
        __ASM_ENABLE_IBRS
        jmp 20f
10:
        lfence
20:
.endm

The ENABLE_IBRS macro is expanded in the point where user space
switches to kernel space, such as:

arch/x86/entry/entry_64.S
1). ENTRY(entry_SYSCALL_64)
2). common_interrupt
3). error_entry
4). ENTRY(nmi)


And DISABLE_IBRS will disable the IBRS register when switching back to
user space from the kernel space to keep the performance in user space.

.macro DISABLE_IBRS
        testl   $1, ibrs_enabled
        jz      9f
        __ASM_DISABLE_IBRS
9:
.endm

Finally, the reason is that before the VMENTRY, it's in the kernel mode,
so, the IBRS register status should be enabled by the above-mentioned
path from the user space to kernel space when "ibrs_enabled == 1". And
SPEC_CTRL IBRS bit should be one. Let assume that if the guest clear the IBRS
register, after the VMEXIT, we should set one back to the SPEC_CTRL
IBRS bit again to re-enable in the kernel mode. Or we can save the
IBRS bit status before the VMENTRY when "ibrs_enabled == 1." But I
think it's redundant. Or do you figure out if there are any
possibilities that in the kernel mode before VMENTRY, the SPEC_CTRL
IBRS bit could be zero?


>
> ...Juerg
>
>
> > +
> >               /* SSBD controlled in MSR_SPEC_CTRL */
> >               if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD))
> >                       hostval |= ssbd_tif_to_spec_ctrl(ti->flags);
>



More information about the kernel-team mailing list