[ 3.5.y.z extended stable ] Patch "IPoIB: Fix send lockup due to missed TX completion" has been added to staging queue

Luis Henriques luis.henriques at canonical.com
Mon Apr 1 15:04:50 UTC 2013

This is a note to let you know that I have just added a patch titled

    IPoIB: Fix send lockup due to missed TX completion

to the linux-3.5.y-queue branch of the 3.5.y.z extended stable tree 
which can be found at:


If you, or anyone else, feel it should not be added to this tree, please 
reply to this email.

For more information about the 3.5.y.z tree, see



From 63d853fceaaf095e01e2d4f3f11bf26f9a73be97 Mon Sep 17 00:00:00 2001
From: Mike Marciniszyn <mike.marciniszyn at intel.com>
Date: Tue, 26 Feb 2013 15:46:27 +0000
Subject: [PATCH] IPoIB: Fix send lockup due to missed TX completion

commit 1ee9e2aa7b31427303466776f455d43e5e3c9275 upstream.

Commit f0dc117abdfa ("IPoIB: Fix TX queue lockup with mixed UD/CM
traffic") attempts to solve an issue where unprocessed UD send
completions can deadlock the netdev.

That patch doesn't fully resolve the issue: if more than half of the
tx_outstanding sends were UD and all of the destinations are RC
reachable, arming the CQ doesn't solve the issue.

This patch passes the IB_CQ_REPORT_MISSED_EVENTS flag to
ib_req_notify_cq().  If the return code is above 0, the UD send CQ
completion callback is called directly to re-arm the send completion timer.

This issue is seen in very large parallel filesystem deployments
and the patch has been shown to correct the issue.

Reviewed-by: Dean Luick <dean.luick at intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn at intel.com>
Signed-off-by: Roland Dreier <roland at purestorage.com>
Signed-off-by: Luis Henriques <luis.henriques at canonical.com>
---
 drivers/infiniband/ulp/ipoib/ipoib_cm.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 014504d..3767853 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -755,9 +755,13 @@ void ipoib_cm_send(struct net_device *dev, struct sk_buff *skb, struct ipoib_cm_
 		if (++priv->tx_outstanding == ipoib_sendq_size) {
 			ipoib_dbg(priv, "TX ring 0x%x full, stopping kernel net queue\n",
 				  tx->qp->qp_num);
-			if (ib_req_notify_cq(priv->send_cq, IB_CQ_NEXT_COMP))
-				ipoib_warn(priv, "request notify on send CQ failed\n");
 			netif_stop_queue(dev);
+			rc = ib_req_notify_cq(priv->send_cq,
+				IB_CQ_NEXT_COMP | IB_CQ_REPORT_MISSED_EVENTS);
+			if (rc < 0)
+				ipoib_warn(priv, "request notify on send CQ failed\n");
+			else if (rc)
+				ipoib_send_comp_handler(priv->send_cq, dev);
 		}
 	}
 }
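
The return-value handling described in the commit message can be
illustrated outside the kernel with a small user-space model.  This is
only a sketch under stated assumptions: every mock_* name, the struct
layout and the arm_* helpers are hypothetical stand-ins, not the ipoib
code.  The only behaviour taken from the real API is the documented
contract of ib_req_notify_cq(): a negative value is an error, 0 means
the CQ was armed, and a positive value (returned only when
IB_CQ_REPORT_MISSED_EVENTS is passed) means completions may already be
queued, so no new event will announce them.

```c
#include <stdio.h>

/* Hypothetical flag values; the real ones are defined by the kernel. */
enum mock_cq_flags {
	MOCK_CQ_NEXT_COMP            = 1 << 0,
	MOCK_CQ_REPORT_MISSED_EVENTS = 1 << 1,
};

/* Hypothetical, heavily simplified completion queue. */
struct mock_cq {
	int pending;	/* completions queued but not yet polled */
	int armed;	/* set once a notification has been requested */
};

static int handler_calls;	/* how often the mock handler ran */

/*
 * Models the ib_req_notify_cq() return contract:
 *   < 0  error
 *     0  notification armed
 *   > 0  only with REPORT_MISSED_EVENTS: completions already pending
 */
static int mock_req_notify_cq(struct mock_cq *cq, int flags)
{
	if (!cq)
		return -1;
	cq->armed = 1;
	if ((flags & MOCK_CQ_REPORT_MISSED_EVENTS) && cq->pending > 0)
		return 1;
	return 0;
}

/*
 * Stand-in for the send completion callback.  The real callback kicks
 * a poll timer; this mock simply drains the queue and counts the call.
 */
static void mock_send_comp_handler(struct mock_cq *cq)
{
	handler_calls++;
	cq->pending = 0;
}

/* Old pattern: any non-zero result is an error, missed events go unseen. */
static int arm_old(struct mock_cq *cq)
{
	if (mock_req_notify_cq(cq, MOCK_CQ_NEXT_COMP)) {
		fprintf(stderr, "request notify on send CQ failed\n");
		return -1;
	}
	return 0;
}

/* New pattern: a positive result triggers the handler directly. */
static int arm_new(struct mock_cq *cq)
{
	int rc = mock_req_notify_cq(cq, MOCK_CQ_NEXT_COMP |
					MOCK_CQ_REPORT_MISSED_EVENTS);

	if (rc < 0)
		fprintf(stderr, "request notify on send CQ failed\n");
	else if (rc)
		mock_send_comp_handler(cq);
	return rc;
}
```

In this model, calling arm_old() on a queue that already holds
completions returns 0 and leaves them unprocessed, whereas arm_new()
on the same queue returns a positive value and runs the handler once.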
