[ 3.5.yuz extended stable ] Patch "libceph: resubmit linger ops when pg mapping changes" has been added to staging queue

Herton Ronaldo Krzesinski herton.krzesinski at canonical.com
Tue Nov 20 17:18:09 UTC 2012


This is a note to let you know that I have just added a patch titled

    libceph: resubmit linger ops when pg mapping changes

to the linux-3.5.y-queue branch of the 3.5.yuz extended stable tree 
which can be found at:

 http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=shortlog;h=refs/heads/linux-3.5.y-queue

If you, or anyone else, feels it should not be added to this tree, please 
reply to this email.

For more information about the 3.5.yuz tree, see
https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable

Thanks.
-Herton

------

>From d52f6da07250063845dd510204a5fcc7016cfd3e Mon Sep 17 00:00:00 2001
From: Sage Weil <sage at inktank.com>
Date: Mon, 30 Jul 2012 16:19:28 -0700
Subject: [PATCH 50/78] libceph: resubmit linger ops when pg mapping changes

commit 6194ea895e447fdf4adfd23f67873a32bf4f15ae upstream.

The linger op registration (i.e., watch) modifies the object state.  As
such, the OSD will reply with success if it has already applied without
doing the associated side-effects (setting up the watch session state).
If we lose the ACK and resubmit, we will see success but the watch will not
be correctly registered and we won't get notifies.

To fix this, always resubmit the linger op with a new tid.  We accomplish
this by re-registering as a linger (i.e., 'registered') if we are not yet
registered.  Then the second loop will treat this just like a normal
case of re-registering.

This mirrors a similar fix on the userland ceph.git, commit 5dd68b95, and
ceph bug #2796.

Signed-off-by: Sage Weil <sage at inktank.com>
Reviewed-by: Alex Elder <elder at inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda at inktank.com>
Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski at canonical.com>
---
 net/ceph/osd_client.c |   26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 4475d17..752b498 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -888,7 +888,9 @@ static void __register_linger_request(struct ceph_osd_client *osdc,
 {
 	dout("__register_linger_request %p\n", req);
 	list_add_tail(&req->r_linger_item, &osdc->req_linger);
-	list_add_tail(&req->r_linger_osd, &req->r_osd->o_linger_requests);
+	if (req->r_osd)
+		list_add_tail(&req->r_linger_osd,
+			      &req->r_osd->o_linger_requests);
 }

 static void __unregister_linger_request(struct ceph_osd_client *osdc,
@@ -1302,8 +1304,9 @@ static void kick_requests(struct ceph_osd_client *osdc, int force_resend)

 	dout("kick_requests %s\n", force_resend ? " (force resend)" : "");
 	mutex_lock(&osdc->request_mutex);
-	for (p = rb_first(&osdc->requests); p; p = rb_next(p)) {
+	for (p = rb_first(&osdc->requests); p; ) {
 		req = rb_entry(p, struct ceph_osd_request, r_node);
+		p = rb_next(p);
 		err = __map_request(osdc, req, force_resend);
 		if (err < 0)
 			continue;  /* error */
@@ -1311,10 +1314,23 @@ static void kick_requests(struct ceph_osd_client *osdc, int force_resend)
 			dout("%p tid %llu maps to no osd\n", req, req->r_tid);
 			needmap++;  /* request a newer map */
 		} else if (err > 0) {
-			dout("%p tid %llu requeued on osd%d\n", req, req->r_tid,
-			     req->r_osd ? req->r_osd->o_osd : -1);
-			if (!req->r_linger)
+			if (!req->r_linger) {
+				dout("%p tid %llu requeued on osd%d\n", req,
+				     req->r_tid,
+				     req->r_osd ? req->r_osd->o_osd : -1);
 				req->r_flags |= CEPH_OSD_FLAG_RETRY;
+			}
+		}
+		if (req->r_linger && list_empty(&req->r_linger_item)) {
+			/*
+			 * register as a linger so that we will
+			 * re-submit below and get a new tid
+			 */
+			dout("%p tid %llu restart on osd%d\n",
+			     req, req->r_tid,
+			     req->r_osd ? req->r_osd->o_osd : -1);
+			__register_linger_request(osdc, req);
+			__unregister_request(osdc, req);
 		}
 	}

--
1.7.9.5





More information about the kernel-team mailing list