[Bug 1908375] Re: ceph-volume lvm list <device> calls blkid numerous times for different devices

Chris MacNaughton 1908375 at bugs.launchpad.net
Tue Feb 9 14:02:50 UTC 2021


The code in this change looks OK, but it's also a lot of change to
propose backporting to a stable release. Unfortunately, Luminous is EoL
upstream; otherwise we could push to land it upstream and bring it in
via an upstream point release, which would be a bit safer for us.

Because this is a fairly large change, I'd like to see a more
comprehensive test case for the SRU to ensure that we aren't regressing
other things by removing the blkid calls. As called out in the
regression potential, things that read the output of ceph-volume may be
impacted, so I'd like to see that covered as well in the test case.
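
One way to cover consumers of ceph-volume output is to capture the
listing before and after the upgrade and compare the two. The following
is only a sketch under assumptions: it presumes the output was saved
with `ceph-volume lvm list --format json` and that the top level of
that JSON is an object (assumed here to be keyed by OSD id). The helper
names diff_listing and diff_listing_files are hypothetical.

```python
import json


def diff_listing(before: dict, after: dict) -> dict:
    """Compare two parsed JSON listings and report top-level differences.

    Assumption: both arguments are the parsed top-level JSON object from
    `ceph-volume lvm list --format json`, captured before and after the fix.
    """
    before_keys, after_keys = set(before), set(after)
    return {
        # Keys that disappeared after the upgrade.
        "missing": sorted(before_keys - after_keys),
        # Keys that only appear after the upgrade.
        "added": sorted(after_keys - before_keys),
        # Keys present in both whose content differs.
        "changed": sorted(
            k for k in before_keys & after_keys if before[k] != after[k]
        ),
    }


def diff_listing_files(before_path: str, after_path: str) -> dict:
    """Load two saved JSON listings from disk and compare them."""
    with open(before_path) as b, open(after_path) as a:
        return diff_listing(json.load(b), json.load(a))
```

An empty "missing", "added" and "changed" result would suggest the
structured output is unchanged; it says nothing about the log file or
the human-readable output, which would need their own checks.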

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/1908375

Title:
  ceph-volume lvm list <device> calls blkid numerous times for
  different devices

Status in ceph package in Ubuntu:
  Fix Released
Status in ceph source package in Bionic:
  In Progress
Status in ceph source package in Focal:
  Fix Released
Status in ceph source package in Groovy:
  Fix Released

Bug description:
  [Impact]

   * Every ceph-volume lvm list <device> call invokes blkid for numerous PARTUUIDs. On setups with many slow I/O devices, this can make the call run for minutes without any real justification.
  In fact, the upstream Ceph approach has changed here: post-Bionic releases already ship a ceph-volume that does not invoke blkid at all in this context, making the call much faster.

  Please examine the attached ceph-volume.log fragment for a single
  ceph-volume call; the accumulated blkid calls take around 1 min 7 s.

  
  [Test Case]

   * Set up a ceph-osd with numerous block devices on which blkid is slow to respond.
   * Run
  time ceph-volume --log-path ceph-volume.log --log-level debug lvm list <device>
  on one of them and check the log to see that most of the execution time is consumed by blkid calls.
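
  The log check in the last step can be roughly automated. The sketch
  below is an assumption-laden illustration, not part of the fix: it
  assumes each log line starts with a timestamp of the form
  [YYYY-MM-DD HH:MM:SS,mmm] and that command invocations are logged with
  the text "Running command:". It attributes to blkid the time between a
  blkid invocation line and the next timestamped line. The function name
  blkid_time is hypothetical.

```python
import re
from datetime import datetime

# Assumed log prefix, e.g. "[2021-02-09 14:00:00,000][ceph_volume.process]..."
TIMESTAMP_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]")


def blkid_time(lines):
    """Return (number of blkid invocations, approximate seconds spent in them).

    The time for one invocation is estimated as the gap between its
    "Running command:" line and the next timestamped log line.
    """
    count = 0
    total = 0.0
    pending = None  # timestamp of a blkid invocation still awaiting its next line
    for line in lines:
        match = TIMESTAMP_RE.match(line)
        if not match:
            continue  # continuation lines carry no timestamp
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        if pending is not None:
            total += (stamp - pending).total_seconds()
            pending = None
        if "Running command:" in line and "blkid" in line:
            count += 1
            pending = stamp
    return count, total


if __name__ == "__main__":
    with open("ceph-volume.log") as log:
        calls, seconds = blkid_time(log)
    print(f"{calls} blkid calls, about {seconds:.1f} s")
```

  On a fixed package, the expectation would be zero blkid invocations
  for the same command.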

  [Where problems could occur]

   * Although the proposed fix does not change how ceph-volume is
  used, any automation that depends on parsing the ceph-volume log
  may notice a change.

  [Other Info]
   
   * The fix for this is available in Focal and later.
   * Xenial is not affected due to lack of ceph-volume in its ceph release.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1908375/+subscriptions


