[Bug 1704106] Re: [SRU] Gathering of thin provisioning stats breaks c-vol

OpenStack Infra 1704106 at bugs.launchpad.net
Fri May 21 17:16:52 UTC 2021


Reviewed:  https://review.opendev.org/c/openstack/cinder/+/792128
Committed: https://opendev.org/openstack/cinder/commit/93b3747806216cb44b23daf848d70e23fe0d3512
Submitter: "Zuul (22348)"
Branch:    stable/victoria

commit 93b3747806216cb44b23daf848d70e23fe0d3512
Author: Gorka Eguileor <geguileo at redhat.com>
Date:   Wed Nov 4 15:34:37 2020 +0100

    RBD: Change rbd_exclusive_cinder_pool's default
    
    In Cinder we always try to have sane defaults, but the current RBD
    default for rbd_exclusive_cinder_pool may lead to issues on deployments
    with a large number of volumes:
    
    - Cinder taking a long time to start.
    - Cinder becoming non-responsive.
    - Cinder stats gathering taking longer than the gathering period.
    
    This is caused by the driver making an independent request to get
    detailed information on each image to accurately calculate the space
    used by the Cinder volumes.
    
    With this patch we change the default to make sure that these issues
    don't happen in the most common deployment case (the exclusive Cinder
    pool).
    
    Related-Bug: #1704106
    Change-Id: I839441a71238cdad540ba8d9d4d18b1f0fa3ee9d
    (cherry picked from commit 4ba6664deefdec5379d31b987be322745eeb4e87)
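
For context, the cost described in the commit message comes from opening
every RBD image in the pool just to read its size. A rough sketch of that
per-image pattern with the rados/rbd Python bindings (the pool name
'volumes' and the ceph.conf path are assumptions, and this is not the
driver's exact code):

    import rados
    import rbd

    # Cluster connection details and pool name are assumptions.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('volumes')
    try:
        total_provisioned = 0
        # Every image returned by the listing is opened individually
        # just to read its size; on pools with many volumes this adds
        # up to a long-running loop.
        for name in rbd.RBD().list(ioctx):
            image = rbd.Image(ioctx, name, read_only=True)
            try:
                total_provisioned += image.size()
            finally:
                image.close()
        print('provisioned bytes:', total_provisioned)
    finally:
        ioctx.close()
        cluster.shutdown()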


** Tags added: in-stable-victoria

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1704106

Title:
  [SRU] Gathering of thin provisioning stats breaks c-vol

Status in OpenStack cinder-ceph charm:
  Fix Released
Status in Cinder:
  Fix Released
Status in Ubuntu Cloud Archive:
  Fix Released
Status in Ubuntu Cloud Archive ocata series:
  Fix Released
Status in Ubuntu Cloud Archive pike series:
  Fix Released

Bug description:
  [Impact]

  Backport the config option added in Queens that allows disabling
  the collection of stats from all rbd volumes, since that
  collection causes large numbers of non-fatal race conditions and
  slows down deletes to the point where the RPC thread pool fills
  up, blocking further requests. Our charms do not configure the
  pool by default and we are not aware of anyone doing this in the
  field, so this patch enables the option by default.

  [Test Case]

  By default no change in behaviour should occur. To test the
  new feature we need to enable it, i.e.:

  * deploy openstack ocata
  * set rbd_exclusive_cinder_pool = true in cinder.conf
  * create 100 volumes via cinder
  * also create 100 volumes in the cinder pool, but using the rbd client directly (see the sketch after this list)
  * delete the cinder volumes (via cinder) and delete the non-cinder rbd volumes using the rbd client
  * ensure there are no exceptions in cinder-volume.log
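
  A minimal sketch of the "rbd client directly" steps above, using the
  rados/rbd Python bindings (the pool name 'cinder-ceph' and the
  ceph.conf path are assumptions; the rbd CLI works just as well):

      import rados
      import rbd

      GiB = 1024 ** 3

      # Pool name and conffile are assumptions for this environment.
      cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
      cluster.connect()
      ioctx = cluster.open_ioctx('cinder-ceph')
      try:
          # Create 100 images in the Cinder pool that Cinder knows
          # nothing about.
          for i in range(100):
              rbd.RBD().create(ioctx, 'non-cinder-%03d' % i, 1 * GiB)
          # ... create and delete the Cinder-managed volumes here ...
          # Remove the non-Cinder images again.
          for i in range(100):
              rbd.RBD().remove(ioctx, 'non-cinder-%03d' % i)
      finally:
          ioctx.close()
          cluster.shutdown()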

  [Regression Potential]

  The default behaviour is unchanged, so no regression is expected.

  ==========================================================================

  The gathering of the thin provisioning stats is done by looping over
  all volumes:

  https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/rbd.py#L369

  For larger deployments, this loop (done at start-up, upon volume
  deletion and as a periodic task) takes too long and hence breaks
  the c-vol service.

  From what I understand, the overall idea of this stats gathering is to
  bring the current real fill status of the pool to the admin's
  attention in case over-subscription was configured. For this, a fill
  status at the pool level (rather than the volume level) should be good
  enough.
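
  For comparison, a pool-level fill status can be read with a single
  call. A minimal sketch with the rados Python bindings (pool name and
  ceph.conf path are assumptions; this only illustrates the idea, it is
  not the change that was merged):

      import rados

      cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
      cluster.connect()
      ioctx = cluster.open_ioctx('volumes')
      try:
          # One request for the whole pool instead of one per image.
          stats = ioctx.get_stats()
          print('bytes used in pool:', stats['num_bytes'])
      finally:
          ioctx.close()
          cluster.shutdown()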

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-cinder-ceph/+bug/1704106/+subscriptions


