[Bug 1863704] Re: wrongly used a string type as int value for CEPH_VOLUME_SYSTEMD_TRIES and CEPH_VOLUME_SYSTEMD_INTERVAL

Edward Hope-Morley edward.hope-morley at canonical.com
Tue Feb 18 09:47:35 UTC 2020


@taodd can you please tell me which releases of Ubuntu Ceph this patch
already exists in and tell me which releases you are targeting this SRU
at. You have set Bionic but is it already in Focal, Eoan etc?

** Also affects: cloud-archive
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1863704

Title:
  wrongly used a string type as int value for CEPH_VOLUME_SYSTEMD_TRIES
  and CEPH_VOLUME_SYSTEMD_INTERVAL

Status in Ubuntu Cloud Archive:
  New
Status in ceph package in Ubuntu:
  New

Bug description:
  [Impact]
  The impact is that we can't manually change the value of the environment variables CEPH_VOLUME_SYSTEMD_TRIES and CEPH_VOLUME_SYSTEMD_INTERVAL, because a value set by the user is a string but is used as an integer.
  With the default values, ceph-volume-systemd only keeps retrying for about 150 seconds. That is not enough in circumstances where the cluster needs longer to activate the disks (e.g. the disks are encrypted by vault; this can happen when vault and the OSD nodes are restarted).
  In this case, the user needs to increase CEPH_VOLUME_SYSTEMD_TRIES or CEPH_VOLUME_SYSTEMD_INTERVAL so that ceph-volume-systemd keeps retrying long enough to start the OSD.
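
To illustrate the type problem, here is a minimal Python sketch, not the upstream code. It assumes the variables are read with os.environ.get and an integer default. The helper name read_int_env and the defaults 30 and 5 (picked only because 30 * 5 matches the 150-second figure above) are assumptions made for illustration.

```python
import os


def read_int_env(name, default):
    """Return environment variable `name` as an int, or `default` if unset.

    Hypothetical helper for illustration; int() raises ValueError if the
    variable is set to something that is not a whole number.
    """
    raw = os.environ.get(name)
    if raw is None:
        return default
    return int(raw)


# Simulate an operator override on one OSD node.
os.environ["CEPH_VOLUME_SYSTEMD_TRIES"] = "100"
os.environ.pop("CEPH_VOLUME_SYSTEMD_INTERVAL", None)

# Without a conversion, a set variable comes back as a string,
# while the fallback default stays an integer.
raw_tries = os.environ.get("CEPH_VOLUME_SYSTEMD_TRIES", 30)

# With an explicit conversion, both values are integers.
tries = read_int_env("CEPH_VOLUME_SYSTEMD_TRIES", 30)
interval = read_int_env("CEPH_VOLUME_SYSTEMD_INTERVAL", 5)

# Approximate retry budget in seconds (sleep time only).
retry_budget = tries * interval
```

With the override in place, the converted values give a retry budget of 100 * 5 seconds instead of the default 30 * 5, which is the behaviour the user expects when raising the number of tries.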

  [Test case]
  1. deploy a ceph cluster (we can use luminous)
  2. set env CEPH_VOLUME_SYSTEMD_TRIES to 100 on one osd node
  3. debug ceph-volume-systemd in the file src/ceph-volume/ceph_volume/systemd/main.py and print the value of 'tries' after it has been used as an integer. We can see that it is not the integer value 100 as we expected.
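
The check in step 3 can be approximated outside of a cluster with a short Python sketch. The function name debug_env_value and the default of 30 are hypothetical and only stand in for whatever debug print is added to main.py; the sketch simply reports the type of the value that os.environ.get returns.

```python
import os


def debug_env_value(name, default):
    """Describe the value and type obtained for `name` (illustrative helper)."""
    value = os.environ.get(name, default)
    return "{0}={1!r} (type {2})".format(name, value, type(value).__name__)


# Step 2 of the test case: set the variable to 100.
os.environ["CEPH_VOLUME_SYSTEMD_TRIES"] = "100"

report = debug_env_value("CEPH_VOLUME_SYSTEMD_TRIES", 30)
print(report)  # CEPH_VOLUME_SYSTEMD_TRIES='100' (type str)

# Integer arithmetic on the unconverted string fails.
try:
    "100" - 1
    arithmetic_ok = True
except TypeError:
    arithmetic_ok = False
```

The report shows a str rather than an int, which is the symptom the test case is meant to expose.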

  [Regression Potential]
  Regression potential is very low. A potential case: if a user had already set these two environment variables,
  then after upgrading to a version with this fix, the effective 'tries' might appear to be lower than before, because the previous value was apparently wrong and may have been wrongly enlarged.

  [Other info]
  Upstream bug report:   https://tracker.ceph.com/issues/43186
  Upstream pull request: https://github.com/ceph/ceph/pull/32106

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1863704/+subscriptions


