[Bug 1704106] Re: [SRU] Gathering of thin provisioning stats breaks c-vol
Corey Bryant
corey.bryant at canonical.com
Thu Mar 28 20:29:50 UTC 2019
This is fixed in the cinder 11.2.1 point release for pike, so we might
as well just pick this up in a new round of point releases. We'll do
that via this bug: https://bugs.launchpad.net/cloud-archive/+bug/1822192
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1704106
Title:
[SRU] Gathering of thin provisioning stats breaks c-vol
Status in Cinder:
Fix Released
Status in Ubuntu Cloud Archive:
Fix Released
Status in Ubuntu Cloud Archive ocata series:
Triaged
Status in Ubuntu Cloud Archive pike series:
Triaged
Bug description:
[Impact]
Backport of a config option, added in Queens, that allows disabling
the collection of stats from all rbd volumes, since this collection
causes tons of non-fatal race conditions and slows down deletes to
the point where the rpc thread pool fills up, blocking further
requests. Our charms do not configure the pool by default and we
are not aware of anyone doing this in the field, so this patch
enables this option by default.
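For reference, enabling the backported option is a one-line change
in cinder.conf, alongside the other rbd driver options in the
backend section (the [ceph] section name below is just an example;
use whatever backend section the deployment defines):

    [ceph]
    # Treat the pool as used exclusively by cinder so per-volume
    # stats collection can be skipped ([ceph] section name is an
    # example only).
    rbd_exclusive_cinder_pool = true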
[Test Case]
By default no change in behaviour should occur. To test the
new feature we need to enable it, as follows:
* deploy openstack ocata
* set rbd_exclusive_cinder_pool = true in cinder.conf
* create 100 volumes via cinder
* also create 100 volumes in the cinder pool, but using the rbd client directly (see the sketch after this list)
* delete the cinder volumes (via cinder) and delete the non-cinder rbd volumes using the rbd client
* ensure there are no exceptions in cinder-volume.log
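The non-cinder half of the test can be scripted with the rbd python
bindings. A minimal sketch, assuming the charm-default pool name
'cinder-ceph' and the standard ceph.conf path (both are assumptions;
adjust to the deployment):

    import rados
    import rbd

    GiB = 1024 ** 3

    # Connect to the cluster and open the pool cinder uses
    # ('cinder-ceph' is an assumed, charm-default pool name).
    with rados.Rados(conffile='/etc/ceph/ceph.conf') as cluster:
        ioctx = cluster.open_ioctx('cinder-ceph')
        try:
            r = rbd.RBD()
            names = ['non-cinder-%03d' % i for i in range(100)]
            for name in names:
                r.create(ioctx, name, GiB)  # 1 GiB image
            for name in names:
                r.remove(ioctx, name)
        finally:
            ioctx.close()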
[Regression Potential]
The default behaviour is unchanged, so no regression is expected.
==========================================================================
The gathering of the thin provisioning stats is done by looping over
all volumes:
https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/rbd.py#L369
For larger deployments, this loop (done at start-up, upon volume
deletion, and as a periodic task) takes too long and hence breaks
the c-vol service.
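Schematically, the expensive part is that every image in the pool is
opened just to read its provisioned size. A simplified sketch of the
pattern (not the exact driver code):

    import rbd

    def total_provisioned_bytes(ioctx):
        # Open each image in the pool in turn; with thousands of
        # volumes this serial open/stat/close loop dominates.
        total = 0
        for name in rbd.RBD().list(ioctx):
            with rbd.Image(ioctx, name, read_only=True) as image:
                total += image.size()
        return total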
From what I understand, the overall idea of this stats gathering is to
bring the current real fill status of the pool to the admin's
attention in case over-subscription was configured. For this, a fill
status at the pool level (rather than the volume level) should be good
enough.
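For comparison, a pool-level fill figure is available from a single
call via the rados python bindings, independent of the number of
volumes (a sketch; pool name and conffile path are assumptions):

    import rados

    with rados.Rados(conffile='/etc/ceph/ceph.conf') as cluster:
        ioctx = cluster.open_ioctx('cinder-ceph')
        try:
            # One call for the whole pool, regardless of volume count.
            stats = ioctx.get_stats()
            print('pool bytes in use: %d' % stats['num_bytes'])
        finally:
            ioctx.close()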
To manage notifications about this bug go to:
https://bugs.launchpad.net/cinder/+bug/1704106/+subscriptions