[Bug 1863704] Re: wrongly used a string type as int value for CEPH_VOLUME_SYSTEMD_TRIES and CEPH_VOLUME_SYSTEMD_INTERVAL

James Page james.page at ubuntu.com
Thu Apr 9 09:48:26 UTC 2020


*** This bug is a duplicate of bug 1804261 ***
    https://bugs.launchpad.net/bugs/1804261

** Changed in: ceph (Ubuntu Focal)
       Status: Triaged => Fix Released

** Changed in: ceph (Ubuntu Disco)
       Status: Triaged => Won't Fix

** This bug has been marked a duplicate of bug 1804261
   Ceph OSD units requires reboot if they boot before vault (and if not unsealed with 150s)

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1863704

Title:
  wrongly used a string type as int value for CEPH_VOLUME_SYSTEMD_TRIES
  and CEPH_VOLUME_SYSTEMD_INTERVAL

Status in Ubuntu Cloud Archive:
  Triaged
Status in Ubuntu Cloud Archive queens series:
  Triaged
Status in Ubuntu Cloud Archive rocky series:
  Triaged
Status in Ubuntu Cloud Archive stein series:
  Triaged
Status in Ubuntu Cloud Archive train series:
  Triaged
Status in ceph package in Ubuntu:
  Fix Released
Status in ceph source package in Bionic:
  Triaged
Status in ceph source package in Disco:
  Won't Fix
Status in ceph source package in Eoan:
  Triaged
Status in ceph source package in Focal:
  Fix Released

Bug description:
  [Impact]
  The values of the environment variables CEPH_VOLUME_SYSTEMD_TRIES and CEPH_VOLUME_SYSTEMD_INTERVAL cannot be changed manually in any effective way, because they are read as strings but used as integers.
  With the default values, ceph-volume-systemd keeps retrying for only about 150 seconds, which is not enough in circumstances where the cluster needs longer to activate the disks (e.g. the disks are encrypted by vault; this can happen when vault and the OSD nodes are restarted).
  In this case, the user needs to increase CEPH_VOLUME_SYSTEMD_TRIES or CEPH_VOLUME_SYSTEMD_INTERVAL so that ceph-volume-systemd keeps retrying long enough to start the OSD.
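
  As a rough illustration of the retry window, here is a minimal sketch, not the ceph-volume code. The helper name retry_window is hypothetical, and the defaults of 30 tries and 5 seconds are assumptions chosen only because their product matches the "about 150 seconds" mentioned above.

```python
import os

# Assumed defaults (not confirmed by this report): 30 tries x 5 s = 150 s.
DEFAULT_TRIES = 30
DEFAULT_INTERVAL = 5


def retry_window(env=None):
    """Return the approximate number of seconds spent retrying.

    Environment values are always strings, so both settings are
    converted with int() before any arithmetic is done on them.
    """
    env = os.environ if env is None else env
    tries = int(env.get("CEPH_VOLUME_SYSTEMD_TRIES", DEFAULT_TRIES))
    interval = int(env.get("CEPH_VOLUME_SYSTEMD_INTERVAL", DEFAULT_INTERVAL))
    return tries * interval


# With nothing set, the window is the assumed default of 150 seconds.
print(retry_window({}))
# Raising the number of tries to 100 extends the window to 500 seconds.
print(retry_window({"CEPH_VOLUME_SYSTEMD_TRIES": "100"}))
```

  Under these assumptions, an operator whose vault takes several minutes to unseal could raise either variable until the product covers the expected delay.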

  [Test case]
  1. Deploy a ceph cluster (luminous can be used).
  2. Set the environment variable CEPH_VOLUME_SYSTEMD_TRIES to 100 on one OSD node.
  3. Debug ceph-volume-systemd in the file src/ceph-volume/ceph_volume/systemd/main.py and print the value of 'tries' after it has been used as an integer. The value is not the integer 100 as expected.
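
  The type problem checked in step 3 can be reproduced outside ceph-volume with plain Python. This is a sketch of the general behaviour of os.environ, not the code from main.py; the fallback value 30 is an assumption used only for illustration.

```python
import os

# Simulate step 2 of the test case.
os.environ["CEPH_VOLUME_SYSTEMD_TRIES"] = "100"

# os.environ only ever holds strings, so the lookup returns '100',
# not 100. The integer fallback (assumed here) is returned only
# when the variable is unset.
raw = os.environ.get("CEPH_VOLUME_SYSTEMD_TRIES", 30)
print(type(raw).__name__, repr(raw))  # str '100'

# Converting explicitly yields the integer the retry loop needs.
tries = int(raw)
print(type(tries).__name__, tries)  # int 100
```

  Printing the type alongside the value, as above, makes the mismatch visible even when the string and the integer look the same in a log line.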

  [Regression Potential]
  Regression potential is very low. One potential case: if a user had already set these two environment variables,
  then after upgrading to a version with this fix, the effective 'tries' might seem lower than before, because the previous value was handled incorrectly and may have been wrongly enlarged.

  [other info]
  Upstream bug report:   https://tracker.ceph.com/issues/43186
  Upstream pull request: https://github.com/ceph/ceph/pull/32106

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1863704/+subscriptions


