[PATCH 2/2] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

Tyler Hicks tyhicks at canonical.com
Thu Apr 5 06:01:09 UTC 2018

CVE-2017-5715 (Spectre v2 Intel)

Flush indirect branches when switching into a process that marked itself
non-dumpable. This gives high-value processes like gpg better protection
without incurring too much performance overhead.
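
For context, the following is a minimal user-space sketch of how a process
can mark itself non-dumpable, which is the condition this patch keys on. It
assumes a Linux system with glibc. The helper name mark_self_non_dumpable()
is hypothetical and not part of this patch; prctl(PR_SET_DUMPABLE) and
prctl(PR_GET_DUMPABLE) are the standard interfaces.

```c
#include <sys/prctl.h>

/*
 * Hypothetical helper (not part of the patch): clear the calling
 * process's dumpable flag. After this, get_dumpable() on the process's
 * mm no longer returns SUID_DUMP_USER, so the context-switch path in
 * this patch would treat the process as one to protect with IBPB.
 *
 * Returns 0 on success, -1 on failure (errno set by prctl).
 */
static int mark_self_non_dumpable(void)
{
	/* 0 is the "not dumpable" value; unused arguments are passed as 0. */
	return prctl(PR_SET_DUMPABLE, 0, 0, 0, 0);
}

/* Returns the current dumpable flag (0 means non-dumpable), or -1 on error. */
static int query_dumpable(void)
{
	return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
}
```

A security-sensitive program would call mark_self_non_dumpable() early in
startup; query_dumpable() is only there to confirm the flag took effect.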

If done naïvely, we could switch to a kernel idle thread and then back
to the original process, such as:

    process A -> idle -> process A

In such a scenario, we do not have to do IBPB here even though the process
is non-dumpable, as we are switching back to the same process after a
hiatus.

To avoid the redundant IBPB, which is expensive, we track the last mm
user context ID. The cost is to have an extra u64 mm context id to track
the last mm we were using before switching to the init_mm used by idle.
Avoiding the extra IBPB is probably worth the extra memory for this
common scenario.
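
The tracking idea described above can be sketched as a small stand-alone
model. This is an illustration under stated assumptions, not the upstream
code: struct mm_sketch, struct cpu_sketch and switch_to_sketch() are
hypothetical names, the barrier is modeled as a counter rather than a write
to MSR_IA32_PRED_CMD, and a NULL mm stands in for a kernel thread such as
idle.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an mm; not the kernel's struct mm_struct. */
struct mm_sketch {
	uint64_t ctx_id;  /* unique per mm, never recycled; 0 is unused */
	bool dumpable;    /* false models get_dumpable(mm) != SUID_DUMP_USER */
};

/* Hypothetical per-CPU state for the model. */
struct cpu_sketch {
	uint64_t last_user_ctx_id;  /* ctx_id of the last user mm on this CPU */
	unsigned int ibpb_count;    /* barriers issued, for illustration only */
};

/*
 * Model of one context switch. mm == NULL models switching to a kernel
 * thread: no barrier is issued and the remembered ctx_id is left alone,
 * so a later switch back to the same user mm is recognized as such.
 */
static void switch_to_sketch(struct cpu_sketch *cpu, const struct mm_sketch *mm)
{
	if (mm == NULL)
		return;

	if (mm->ctx_id != cpu->last_user_ctx_id) {
		/* Different user context: barrier only for non-dumpable targets. */
		if (!mm->dumpable)
			cpu->ibpb_count++;  /* stands in for the IBPB MSR write */
		cpu->last_user_ctx_id = mm->ctx_id;
	}
}
```

In this model, the sequence process A -> idle -> process A issues the
barrier once (on the first switch into a non-dumpable A) and not again on
the return from idle. Comparing a never-recycled ID instead of the mm
pointer is what avoids a false match when a freed mm's address is reused.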

For those cases where tlb_defer_switch_to_init_mm() returns true (non
PCID), lazy tlb will defer switch to init_mm, so we will not be changing
the mm for the process A -> idle -> process A switch. So IBPB will be
skipped for this case.

Thanks to the reviewers and Andy Lutomirski for the suggestion of
using ctx_id which got rid of the problem of mm pointer recycling.

Signed-off-by: Tim Chen <tim.c.chen at linux.intel.com>
Signed-off-by: David Woodhouse <dwmw at amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
Cc: ak at linux.intel.com
Cc: karahmed at amazon.de
Cc: arjan at linux.intel.com
Cc: torvalds at linux-foundation.org
Cc: linux at dominikbrodowski.net
Cc: peterz at infradead.org
Cc: bp at alien8.de
Cc: luto at kernel.org
Cc: pbonzini at redhat.com
Cc: gregkh at linux-foundation.org
Link: https://lkml.kernel.org/r/1517263487-3708-1-git-send-email-dwmw@amazon.co.uk
(backported from commit 18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7)
[tyhicks: Dropped the enhancement that tracked the last mm user context ID]
[tyhicks: Only use IBPB when the prev and next mm's are different]
Signed-off-by: Tyler Hicks <tyhicks at canonical.com>
---
 arch/x86/mm/tlb.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 0794ff7..16e7baf 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -100,9 +100,6 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 	unsigned cpu = smp_processor_id();
-	if (ibpb_inuse && boot_cpu_has(X86_FEATURE_SPEC_CTRL))
-		native_wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
 	if (likely(prev != next)) {
 		this_cpu_write(cpu_tlbstate.state, TLBSTATE_OK);
 		this_cpu_write(cpu_tlbstate.active_mm, next);
@@ -140,6 +137,25 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 		/* Stop flush ipis for the previous mm */
 		cpumask_clear_cpu(cpu, mm_cpumask(prev));
+		/*
+		 * Avoid user/user BTB poisoning by flushing the branch
+		 * predictor when switching between processes. This stops
+		 * one process from doing Spectre-v2 attacks on another.
+		 *
+		 * As an optimization, flush indirect branches only when
+		 * switching into processes that disable dumping. This
+		 * protects high value processes like gpg, without having
+		 * too high performance overhead. IBPB is *expensive*!
+		 *
+		 * This will not flush branches when switching into kernel
+		 * threads. It will flush if we switch to a different
+		 * non-dumpable process.
+		 */
+		if (tsk && tsk->mm &&
+		    get_dumpable(tsk->mm) != SUID_DUMP_USER &&
+		    ibpb_inuse && boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+			native_wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
 		/* Load the LDT, if the LDT is different: */
 		if (unlikely(prev->context.ldt != next->context.ldt))
