[Bug 1840348] Re: Sharded OpWQ drops suicide_grace after waiting for work
Dan Hill
1840348 at bugs.launchpad.net
Fri Apr 10 19:41:44 UTC 2020
** Also affects: ceph (Ubuntu Focal)
Importance: Medium
Assignee: Dan Hill (hillpd)
Status: Triaged
** Also affects: ceph (Ubuntu Bionic)
Importance: Undecided
Status: New
** Also affects: ceph (Ubuntu Eoan)
Importance: Undecided
Status: New
** Changed in: ceph (Ubuntu Bionic)
Status: New => Confirmed
** Changed in: ceph (Ubuntu Bionic)
Assignee: (unassigned) => Dan Hill (hillpd)
** Changed in: ceph (Ubuntu Eoan)
Assignee: (unassigned) => Dan Hill (hillpd)
** Changed in: ceph (Ubuntu Bionic)
Importance: Undecided => Medium
** Changed in: ceph (Ubuntu Eoan)
Importance: Undecided => Medium
** Changed in: ceph (Ubuntu Eoan)
Status: New => Confirmed
** Changed in: ceph (Ubuntu Focal)
Status: Triaged => Confirmed
** Description changed:
[Impact]
The Sharded OpWQ will opportunistically wait for more work when processing an empty queue. While waiting, the heartbeat timeout and suicide_grace values are modified. On Luminous, the `threadpool_default_timeout` grace is left applied and suicide_grace is left disabled. On later releases, both the grace and suicide_grace are left disabled.
After finding work, the original work queue grace/suicide_grace values
are not re-applied. This can result in hung operations that do not
trigger an OSD suicide recovery.
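For illustration, here is a minimal, self-contained C++ sketch of the problematic pattern. The types and names are hypothetical (this is not Ceph source), and it models the later-release behavior where both deadlines are cleared while waiting:

```cpp
// Minimal model of the bug (hypothetical types; not Ceph source).
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>

struct Heartbeat {
  std::chrono::seconds grace{15};          // hang-warning threshold
  std::chrono::seconds suicide_grace{150}; // hard-kill threshold; 0 = disabled
};

struct ShardQueue {
  std::mutex m;
  std::condition_variable cv;
  std::queue<std::function<void()>> q;
};

void process(ShardQueue& sd, Heartbeat& hb) {
  std::unique_lock<std::mutex> l(sd.m);
  if (sd.q.empty()) {
    // Relaxing the deadlines while idle is deliberate: an empty queue
    // should not look like a hung worker.
    hb.grace = std::chrono::seconds(0);
    hb.suicide_grace = std::chrono::seconds(0);  // disabled while waiting
    sd.cv.wait_for(l, std::chrono::seconds(2),
                   [&] { return !sd.q.empty(); });
    if (sd.q.empty())
      return;
    // BUG: work was found, but grace/suicide_grace are still disabled.
  }
  auto work = std::move(sd.q.front());
  sd.q.pop();
  l.unlock();
  work();  // after the wait path above, a hang here never trips suicide
}
```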
The missing suicide recovery was observed on Luminous 12.2.11. The
environment was consistently hitting a known authentication race
condition (issue#37778 [0]) due to repeated OSD service restarts on a
node exhibiting MCEs from a faulty DIMM.
The auth race condition would stall pg operations. In some cases, the
hung ops would persist for hours without suicide recovery.
[Test Case]
- In-Progress -
Haven't landed on a reliable reproducer yet. Currently testing the fix by exercising I/O. Since the fix applies to all versions of Ceph, the plan is to let it bake in the latest release before considering a back-port.
[Regression Potential]
This fix improves suicide_grace coverage of the Sharded OpWQ.
This change is made in a critical code path that drives client I/O. An
OSD suicide will trigger a service restart, and repeated restarts
(flapping) will adversely impact cluster performance.
The fix mitigates risk by keeping the applied suicide_grace value
consistent with the value applied before entering
`OSD::ShardedOpWQ::_process()`. The fix is also restricted to the empty
queue edge-case that drops the suicide_grace timeout. The suicide_grace
value is only re-applied when work is found after waiting on an empty
queue.
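Continuing the same hypothetical model from above, this is roughly the shape of that narrow fix: capture the deadlines on entry and re-apply them only on the wake-with-work path, leaving the idle path untouched. (In Ceph itself the deadlines are driven through the heartbeat map's `reset_timeout()` rather than a plain struct.)

```cpp
// Sketch of the fix, reusing Heartbeat/ShardQueue from the model above.
void process_fixed(ShardQueue& sd, Heartbeat& hb) {
  const auto saved_grace = hb.grace;           // values in effect on entry
  const auto saved_suicide = hb.suicide_grace;
  std::unique_lock<std::mutex> l(sd.m);
  if (sd.q.empty()) {
    hb.grace = std::chrono::seconds(0);
    hb.suicide_grace = std::chrono::seconds(0);
    sd.cv.wait_for(l, std::chrono::seconds(2),
                   [&] { return !sd.q.empty(); });
    if (sd.q.empty())
      return;  // still idle: behavior is unchanged from before the fix
    // Work found after waiting: restore the entry values, nothing else.
    hb.grace = saved_grace;
    hb.suicide_grace = saved_suicide;
  }
  auto work = std::move(sd.q.front());
  sd.q.pop();
  l.unlock();
  work();  // a hang here is once again covered by suicide_grace
}
```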
- In-Progress -
- The fix will bake upstream on later levels before back-port consideration.
+ The fix needs to bake upstream on later levels before back-port consideration.
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/1840348
Title:
Sharded OpWQ drops suicide_grace after waiting for work
Status in ceph package in Ubuntu:
Confirmed
Status in ceph source package in Bionic:
Confirmed
Status in ceph source package in Eoan:
Confirmed
Status in ceph source package in Focal:
Confirmed
Bug description:
[Impact]
The Sharded OpWQ will opportunistically wait for more work when processing an empty queue. While waiting, the heartbeat timeout and suicide_grace values are modified. On Luminous, the `threadpool_default_timeout` grace is left applied and suicide_grace is left disabled. On later releases, both the grace and suicide_grace are left disabled.
After finding work, the original work queue grace/suicide_grace values
are not re-applied. This can result in hung operations that do not
trigger an OSD suicide recovery.
The missing suicide recovery was observed on Luminous 12.2.11. The
environment was consistently hitting a known authentication race
condition (issue#37778 [0]) due to repeated OSD service restarts on a
node exhibiting MCEs from a faulty DIMM.
The auth race condition would stall pg operations. In some cases, the
hung ops would persist for hours without suicide recovery.
[Test Case]
- In-Progress -
Haven't landed on a reliable reproducer yet. Currently testing the fix by exercising I/O. Since the fix applies to all versions of Ceph, the plan is to let it bake in the latest release before considering a back-port.
[Regression Potential]
This fix improves suicide_grace coverage of the Sharded OpWQ.
This change is made in a critical code path that drives client I/O. An
OSD suicide will trigger a service restart, and repeated restarts
(flapping) will adversely impact cluster performance.
The fix mitigates risk by keeping the applied suicide_grace value
consistent with the value applied before entering
`OSD::ShardedOpWQ::_process()`. The fix is also restricted to the
empty queue edge-case that drops the suicide_grace timeout. The
suicide_grace value is only re-applied when work is found after
waiting on an empty queue.
- In-Progress -
The fix needs to bake upstream on later levels before back-port consideration.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1840348/+subscriptions