[ 3.5.y.z extended stable ] Patch "workqueue: cond_resched() after processing each work item" has been added to staging queue

Luis Henriques luis.henriques at canonical.com
Wed Oct 2 15:03:52 UTC 2013


This is a note to let you know that I have just added a patch titled

    workqueue: cond_resched() after processing each work item

to the linux-3.5.y-queue branch of the 3.5.y.z extended stable tree 
which can be found at:

 http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=shortlog;h=refs/heads/linux-3.5.y-queue

If you, or anyone else, feel it should not be added to this tree, please
reply to this email.

For more information about the 3.5.y.z tree, see
https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable

Thanks.
-Luis

------

From 263c4396163905e4e7a4eb7f28d16b706c966a83 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj at kernel.org>
Date: Wed, 28 Aug 2013 17:33:37 -0400
Subject: [PATCH] workqueue: cond_resched() after processing each work item

commit b22ce2785d97423846206cceec4efee0c4afd980 upstream.

If !PREEMPT, a kworker running work items back to back can hog the CPU.
This becomes dangerous when a self-requeueing work item which is
waiting for something to happen races against stop_machine.  Such a
self-requeueing work item would requeue itself indefinitely, hogging
the kworker and the CPU it's running on, while stop_machine would wait
for that CPU to enter stop_machine while preventing anything else from
happening on all other CPUs.  The two would deadlock.

Jamie Liu reports that this deadlock scenario exists around
scsi_requeue_run_queue() and libata port multiplier support, where one
port may exclude command processing from other ports.  With the right
timing, scsi_requeue_run_queue() can end up requeueing itself trying
to execute an IO which is asked to be retried while another device has
exclusive access, which in turn can't make forward progress due to
stop_machine.

Fix it by invoking cond_resched() after executing each work item.

Signed-off-by: Tejun Heo <tj at kernel.org>
Reported-by: Jamie Liu <jamieliu at google.com>
References: http://thread.gmane.org/gmane.linux.kernel/1552567
Signed-off-by: Luis Henriques <luis.henriques at canonical.com>
---
 kernel/workqueue.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0395ca8..89743ae 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1920,6 +1920,15 @@ __acquires(&gcwq->lock)
 		dump_stack();
 	}

+	/*
+	 * The following prevents a kworker from hogging CPU on !PREEMPT
+	 * kernels, where a requeueing work item waiting for something to
+	 * happen could deadlock with stop_machine as such work item could
+	 * indefinitely requeue itself while all other CPUs are trapped in
+	 * stop_machine.
+	 */
+	cond_resched();
+
 	spin_lock_irq(&gcwq->lock);

 	/* clear cpu intensive status */
--
1.8.3.2
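
The effect of the change can be illustrated with a small user-space
model.  This is only a sketch under simplifying assumptions, not kernel
code: struct sim, other_task_run(), sim_cond_resched(), work_fn() and
worker_loop() are hypothetical names invented for the illustration, and
the stand-in yield always lets the other task run, whereas the real
cond_resched() reschedules only when a reschedule is pending.  The
iteration budget exists only so the non-yielding case terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct sim {
	bool event_done;	/* the condition the work item waits for */
	unsigned long runs;	/* how many times the work item has run */
};

/*
 * Models the other task (think of the stop_machine side) whose progress
 * is what eventually satisfies the condition.  In this model it can only
 * run when the worker gives up the CPU.
 */
static void other_task_run(struct sim *s)
{
	s->event_done = true;
}

/* Simplified stand-in for cond_resched(): always lets the other task run. */
static void sim_cond_resched(struct sim *s)
{
	other_task_run(s);
}

/* Self-requeueing work item: returns true if it wants to be requeued. */
static bool work_fn(struct sim *s)
{
	s->runs++;
	return !s->event_done;
}

/*
 * Worker loop on a non-preemptible CPU.  Runs the work item back to back
 * for at most 'budget' iterations and returns true if the queue drained.
 */
static bool worker_loop(struct sim *s, bool yield_after_each,
			unsigned long budget)
{
	bool queued = true;
	unsigned long i;

	for (i = 0; i < budget && queued; i++) {
		queued = work_fn(s);
		if (yield_after_each)
			sim_cond_resched(s);
	}
	assert(s->runs == i);	/* one run per loop iteration */
	return !queued;
}
```

In this model, calling worker_loop() with yield_after_each set drains
the queue after the second run of the work item, while calling it
without the yield spins until the budget is exhausted, which mirrors
the hog described in the commit message.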




