[ 3.5.y.z extended stable ] Patch "xen/evtchn: avoid a deadlock when unbinding an event channel" has been added to staging queue

Luis Henriques luis.henriques at canonical.com
Sun Aug 4 09:49:29 UTC 2013


This is a note to let you know that I have just added a patch titled

    xen/evtchn: avoid a deadlock when unbinding an event channel

to the linux-3.5.y-queue branch of the 3.5.y.z extended stable tree 
which can be found at:

 http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=shortlog;h=refs/heads/linux-3.5.y-queue

If you, or anyone else, feels it should not be added to this tree, please 
reply to this email.

For more information about the 3.5.y.z tree, see
https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable

Thanks.
-Luis

------

>From 220c8f6347b581d7c8beb7996b87c736222ef9f4 Mon Sep 17 00:00:00 2001
From: David Vrabel <david.vrabel at citrix.com>
Date: Fri, 19 Jul 2013 15:51:58 +0100
Subject: [PATCH] xen/evtchn: avoid a deadlock when unbinding an event channel

commit 179fbd5a45f0d4034cc6fd37b8d367a3b79663c4 upstream.

Unbinding an event channel (either with the ioctl or when the evtchn
device is closed) may deadlock because disable_irq() is called with
port_user_lock held which is also locked by the interrupt handler.

Think of the IOCTL_EVTCHN_UNBIND is being serviced, the routine has
just taken the lock, and an interrupt happens. The evtchn_interrupt
is invoked, tries to take the lock and spins forever.

A quick glance at the code shows that the spinlock is a local IRQ
variant. Unfortunately that does not help as "disable_irq() waits for
the interrupt handler on all CPUs to stop running.  If the irq occurs
on another VCPU, it tries to take port_user_lock and can't because
the unbind ioctl is holding it." (from David). Hence we cannot
depend on the said spinlock to protect us. We could make it a system
wide IRQ disable spinlock but there is a better way.

We can piggyback on the fact that the existence of the spinlock is
to make get_port_user() checks be up-to-date. And we can alter those
checks to not depend on the spin lock (as it's protected by u->bind_mutex
in the ioctl) and can remove the unnecessary locking (this is
IOCTL_EVTCHN_UNBIND) path.

In the interrupt handler we cannot use the mutex, but we do not
need it.

"The unbind disables the irq before making the port user stale, so when
you clear it you are guaranteed that the interrupt handler that might
use that port cannot be running." (from David).

Hence this patch removes the spinlock usage on the teardown path
and piggybacks on disable_irq happening before we muck with the
get_port_user() data. This ensures that the interrupt handler will
never run on stale data.

Signed-off-by: David Vrabel <david.vrabel at citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk at oracle.com>
[v1: Expanded the commit description a bit]
Cc: Jonghwan Choi <jhbird.choi at gmail.com>
Signed-off-by: Luis Henriques <luis.henriques at canonical.com>
---
 drivers/xen/evtchn.c | 21 ++-------------------
 1 file changed, 2 insertions(+), 19 deletions(-)

diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c
index b2db77e..82b27d4 100644
--- a/drivers/xen/evtchn.c
+++ b/drivers/xen/evtchn.c
@@ -377,18 +377,12 @@ static long evtchn_ioctl(struct file *file,
 		if (unbind.port >= NR_EVENT_CHANNELS)
 			break;

-		spin_lock_irq(&port_user_lock);
-
 		rc = -ENOTCONN;
-		if (get_port_user(unbind.port) != u) {
-			spin_unlock_irq(&port_user_lock);
+		if (get_port_user(unbind.port) != u)
 			break;
-		}

 		disable_irq(irq_from_evtchn(unbind.port));

-		spin_unlock_irq(&port_user_lock);
-
 		evtchn_unbind_from_user(u, unbind.port);

 		rc = 0;
@@ -488,26 +482,15 @@ static int evtchn_release(struct inode *inode, struct file *filp)
 	int i;
 	struct per_user_data *u = filp->private_data;

-	spin_lock_irq(&port_user_lock);
-
-	free_page((unsigned long)u->ring);
-
 	for (i = 0; i < NR_EVENT_CHANNELS; i++) {
 		if (get_port_user(i) != u)
 			continue;

 		disable_irq(irq_from_evtchn(i));
-	}
-
-	spin_unlock_irq(&port_user_lock);
-
-	for (i = 0; i < NR_EVENT_CHANNELS; i++) {
-		if (get_port_user(i) != u)
-			continue;
-
 		evtchn_unbind_from_user(get_port_user(i), i);
 	}

+	free_page((unsigned long)u->ring);
 	kfree(u->name);
 	kfree(u);

--
1.8.3.2





More information about the kernel-team mailing list