[Bug 2013960] Re: Recovery operation takes high priority than client I/O with mclock scheduler
Ponnuvel Palaniyappan
2013960 at bugs.launchpad.net
Tue Apr 4 17:11:13 UTC 2023
** Description changed:
- Starting with Quincy, the mclock_scheduler is used as default. However,
- the default recovery settings are very high that it the impact on client
- I/O can be really high depending on the amount of recovery operations
- needed to be done.
-
- Affects Quincy only.
+ Starting with Quincy, the mclock_scheduler is used as default for OSD op queue. However, the default recovery settings are very high that it the impact on client I/O can be really high depending on the amount of recovery operations needed to be done. This is a bug and has been fixed
+ in 'main' branch and backported to Quincy [0][1].
There's no upstream Quincy release with this fix yet.
17.2.6 will have this fix which is undergoing QA at the moment.
- Upstream bug: https://tracker.ceph.com/issues/57529
- Upstream fix: https://github.com/ceph/ceph/pull/48226
+
+ Workaround:
+
+ There are couple of ways this can be mitigated in Quincy.
+
+ 1. Use the 'wpq' as osd_op_queue. This has been the default in previous
+ releases and works just fine. This will require restarting OSDs.
+
+ 2. Stick with mclock scheduler but use custom mclock profile. This will allow users to be modify recovery parameters.
+ ```
+ osd_mclock_scheduler_background_recovery_res
+ osd_mclock_scheduler_background_recovery_wgt
+ osd_mclock_scheduler_background_recovery_lim
+ ```
+ To be able to use this option, 17.2.4 or later is required due to another
+ bug [2].
+
+ NB: This affects Quincy release only. Older (pacific, octopus, et all) use
+ 'wpq' and as much the recovery parameters can be modified as usual. Only
+ starting from Quincy this has changed.
+
+ [0] https://tracker.ceph.com/issues/57529
+ [1] https://github.com/ceph/ceph/pull/48226
+ [2] https://tracker.ceph.com/issues/55153
** Description changed:
Starting with Quincy, the mclock_scheduler is used as default for OSD op queue. However, the default recovery settings are very high that it the impact on client I/O can be really high depending on the amount of recovery operations needed to be done. This is a bug and has been fixed
in 'main' branch and backported to Quincy [0][1].
There's no upstream Quincy release with this fix yet.
17.2.6 will have this fix which is undergoing QA at the moment.
-
Workaround:
There are couple of ways this can be mitigated in Quincy.
1. Use the 'wpq' as osd_op_queue. This has been the default in previous
releases and works just fine. This will require restarting OSDs.
- 2. Stick with mclock scheduler but use custom mclock profile. This will allow users to be modify recovery parameters.
+ 2. Stick with mclock scheduler but use custom mclock profile. This will allow users to be modify recovery parameters.
```
osd_mclock_scheduler_background_recovery_res
osd_mclock_scheduler_background_recovery_wgt
osd_mclock_scheduler_background_recovery_lim
```
- To be able to use this option, 17.2.4 or later is required due to another
- bug [2].
+ To be able to use this option, 17.2.4 or later is required due to another
+ bug [2]. So probably it's both simpler to stick with 'wpq' until the fix for [0] is available or 17.2.6 is out.
NB: This affects Quincy release only. Older (pacific, octopus, et all) use
'wpq' and as much the recovery parameters can be modified as usual. Only
starting from Quincy this has changed.
+
[0] https://tracker.ceph.com/issues/57529
[1] https://github.com/ceph/ceph/pull/48226
[2] https://tracker.ceph.com/issues/55153
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/2013960
Title:
Recovery operation takes high priority than client I/O with mclock
scheduler
Status in ceph package in Ubuntu:
New
Bug description:
Starting with Quincy, mclock_scheduler is the default OSD op queue. However, the default recovery settings are so high that the impact on client I/O can be severe, depending on how much recovery work needs to be done. This is a bug; it has been fixed in the 'main' branch and backported to Quincy [0][1].
There's no upstream Quincy release with this fix yet.
17.2.6, which is undergoing QA at the moment, will have this fix.
Workaround:
There are a couple of ways this can be mitigated in Quincy.
1. Use 'wpq' as the osd_op_queue. This has been the default in
previous releases and works just fine. This will require restarting
OSDs.
2. Stick with the mclock scheduler but use a custom mclock profile. This allows users to modify the recovery parameters:
```
osd_mclock_scheduler_background_recovery_res
osd_mclock_scheduler_background_recovery_wgt
osd_mclock_scheduler_background_recovery_lim
```
To be able to use this option, 17.2.4 or later is required due to another
bug [2]. So it's probably simpler to stick with 'wpq' until the fix for [0] is available or 17.2.6 is out.
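As a rough sketch of how the two workarounds above might be applied with `ceph config set`: the helper names (`run`, `use_wpq`, `use_custom_mclock`) and the `APPLY` switch are made up for this illustration, and the numeric arguments are placeholders rather than recommended values. Check the documentation for your release for the units and valid ranges of the recovery options before applying anything. By default the script only prints the commands.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): prints each command unless APPLY=1 is set,
# in which case the command is executed against the cluster.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Workaround 1: go back to the 'wpq' op queue.
# The OSDs must be restarted afterwards for this to take effect.
use_wpq() {
    run ceph config set osd osd_op_queue wpq
}

# Workaround 2: keep mclock but switch to the custom profile and set the
# recovery reservation, weight and limit explicitly.
# Arguments: <res> <wgt> <lim> (placeholders, no defaults are assumed here).
use_custom_mclock() {
    if [ "$#" -ne 3 ]; then
        echo "usage: use_custom_mclock <res> <wgt> <lim>" >&2
        return 1
    fi
    run ceph config set osd osd_mclock_profile custom
    run ceph config set osd osd_mclock_scheduler_background_recovery_res "$1"
    run ceph config set osd osd_mclock_scheduler_background_recovery_wgt "$2"
    run ceph config set osd osd_mclock_scheduler_background_recovery_lim "$3"
}
```

Calling `use_wpq` or `use_custom_mclock <res> <wgt> <lim>` after sourcing the script prints the commands for review; running the same call with `APPLY=1` in the environment executes them.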
NB: This affects the Quincy release only. Older releases (Pacific, Octopus, et al.) use
'wpq', and as such the recovery parameters can be modified as usual. This has
changed only starting with Quincy.
[0] https://tracker.ceph.com/issues/57529
[1] https://github.com/ceph/ceph/pull/48226
[2] https://tracker.ceph.com/issues/55153
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/2013960/+subscriptions