[3.16.y-ckt stable] Patch "x86/nmi/64: Use DF to avoid userspace RSP confusing nested NMI detection" has been added to staging queue

Luis Henriques luis.henriques at canonical.com
Tue Aug 11 12:55:45 UTC 2015


This is a note to let you know that I have just added a patch titled

    x86/nmi/64: Use DF to avoid userspace RSP confusing nested NMI detection

to the linux-3.16.y-queue branch of the 3.16.y-ckt extended stable tree 
which can be found at:

    http://kernel.ubuntu.com/git/ubuntu/linux.git/log/?h=linux-3.16.y-queue

This patch is scheduled to be released in version 3.16.7-ckt16.

If you, or anyone else, feel it should not be added to this tree, please
reply to this email.

For more information about the 3.16.y-ckt tree, see
https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable

Thanks.
-Luis

------

From 84b6f86649f5e84d2619c569ea0d3dc88d47d4ad Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto at kernel.org>
Date: Wed, 15 Jul 2015 10:29:38 -0700
Subject: x86/nmi/64: Use DF to avoid userspace RSP confusing nested NMI
 detection

commit 810bc075f78ff2c221536eb3008eac6a492dba2d upstream.

We have a tricky bug in the nested NMI code: if we see RSP
pointing to the NMI stack on NMI entry from kernel mode, we
assume that we are executing a nested NMI.

This isn't quite true.  A malicious userspace program can point
RSP at the NMI stack, issue SYSCALL, and arrange for an NMI to
happen while RSP is still pointing at the NMI stack.

Fix it with a sneaky trick.  Set DF in the region of code that
the RSP check is intended to detect.  IRET will clear DF
atomically.

( Note: other than paravirt, there's little need for all this
  complexity. We could check RIP instead of RSP. )

Signed-off-by: Andy Lutomirski <luto at kernel.org>
Reviewed-by: Steven Rostedt <rostedt at goodmis.org>
Cc: Borislav Petkov <bp at suse.de>
Cc: Linus Torvalds <torvalds at linux-foundation.org>
Cc: Peter Zijlstra <peterz at infradead.org>
Cc: Thomas Gleixner <tglx at linutronix.de>
Signed-off-by: Ingo Molnar <mingo at kernel.org>
[bwh: Backported to 4.0: adjust filename, context]
Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
[ luis: backported to 3.16: Used Ben's backport to 4.0 ]
Signed-off-by: Luis Henriques <luis.henriques at canonical.com>
---
 arch/x86/kernel/entry_64.S | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)
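
Reviewer's note (not part of the commit): the decision the patched entry
code makes can be modelled in plain C. The following is only a sketch
under stated assumptions, not the kernel's implementation. The names
classify_nmi, nmi_kind and the parameters are hypothetical; the inclusive
stack bounds are an assumption, and the model covers only the stack-range
test and the DF test from the hunks below, not the rest of the NMI entry
path.

```c
#include <assert.h>
#include <stdint.h>

#define X86_EFLAGS_DF 0x400u	/* direction flag, bit 10 of EFLAGS */

/* Hypothetical result type for this model; not a kernel definition. */
enum nmi_kind { FIRST_NMI, NESTED_NMI };

/*
 * Model of the check made on NMI entry from kernel mode.
 *
 * saved_rsp / saved_eflags: values from the interrupted context's frame.
 * nmi_stack_top / nmi_stack_size: extent of the NMI stack (assumed
 * inclusive bounds, for illustration only).
 */
static enum nmi_kind classify_nmi(uint64_t saved_rsp, uint64_t saved_eflags,
				  uint64_t nmi_stack_top,
				  uint64_t nmi_stack_size)
{
	uint64_t nmi_stack_bottom;

	assert(nmi_stack_size <= nmi_stack_top);
	nmi_stack_bottom = nmi_stack_top - nmi_stack_size;

	/* Interrupted RSP outside the NMI stack: an ordinary first NMI. */
	if (saved_rsp > nmi_stack_top || saved_rsp < nmi_stack_bottom)
		return FIRST_NMI;

	/*
	 * RSP is within the NMI stack.  Before this patch that alone meant
	 * "nested".  Now DF must also be set: the NMI exit path sets DF
	 * before clearing "NMI executing", while SYSCALL masks DF, so a
	 * user-chosen RSP arrives here with DF clear.
	 */
	if (!(saved_eflags & X86_EFLAGS_DF))
		return FIRST_NMI;	/* RSP was user controlled */

	return NESTED_NMI;
}
```

With a hypothetical stack top of 0x10000 and size 0x1000, a saved RSP of
0xF800 with DF clear is classified as FIRST_NMI, and the same RSP with DF
set is classified as NESTED_NMI.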

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 8a6866b2f2a6..79565bd80cc2 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1613,7 +1613,14 @@ ENTRY(nmi)
 	/*
 	 * Now test if the previous stack was an NMI stack.  This covers
 	 * the case where we interrupt an outer NMI after it clears
-	 * "NMI executing" but before IRET.
+	 * "NMI executing" but before IRET.  We need to be careful, though:
+	 * there is one case in which RSP could point to the NMI stack
+	 * despite there being no NMI active: naughty userspace controls
+	 * RSP at the very beginning of the SYSCALL targets.  We can
+	 * pull a fast one on naughty userspace, though: we program
+	 * SYSCALL to mask DF, so userspace cannot cause DF to be set
+	 * if it controls the kernel's RSP.  We set DF before we clear
+	 * "NMI executing".
 	 */
 	lea	6*8(%rsp), %rdx
 	/* Compare the NMI stack (rdx) with the stack we came from (4*8(%rsp)) */
@@ -1624,10 +1631,16 @@ ENTRY(nmi)
 	cmpq	%rdx, 4*8(%rsp)
 	/* If it is below the NMI stack, it is a normal NMI */
 	jb	first_nmi
-	/* Ah, it is within the NMI stack, treat it as nested */
+
+	/* Ah, it is within the NMI stack. */
+
+	testb	$(X86_EFLAGS_DF >> 8), (3*8 + 1)(%rsp)
+	jz	first_nmi	/* RSP was user controlled. */

 	CFI_REMEMBER_STATE

+	/* This is a nested NMI. */
+
 nested_nmi:
 	/*
 	 * Modify the "iret" frame to point to repeat_nmi, forcing another
@@ -1739,8 +1752,16 @@ nmi_restore:

 	RESTORE_ALL 6*8

-	/* Clear "NMI executing". */
-	movq $0, 5*8(%rsp)
+	/*
+	 * Clear "NMI executing".  Set DF first so that we can easily
+	 * distinguish the remaining code between here and IRET from
+	 * the SYSCALL entry and exit paths.  On a native kernel, we
+	 * could just inspect RIP, but, on paravirt kernels,
+	 * INTERRUPT_RETURN can translate into a jump into a
+	 * hypercall page.
+	 */
+	std
+	movq	$0, 5*8(%rsp)		/* clear "NMI executing" */

 	/*
 	 * INTERRUPT_RETURN reads the "iret" frame and exits the NMI



