[Bug 1863704] Re: wrongly used a string type as int value for CEPH_VOLUME_SYSTEMD_TRIES and CEPH_VOLUME_SYSTEMD_INTERVAL

dongdong tao 1863704 at bugs.launchpad.net
Wed Feb 19 12:40:54 UTC 2020


Proposed a disco debdiff.

** Patch added: "disco.debdiff"
   https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1863704/+attachment/5329513/+files/disco.debdiff

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1863704

Title:
  wrongly used a string type as int value for CEPH_VOLUME_SYSTEMD_TRIES
  and CEPH_VOLUME_SYSTEMD_INTERVAL

Status in Ubuntu Cloud Archive:
  New
Status in ceph package in Ubuntu:
  New
Status in ceph source package in Bionic:
  New
Status in ceph source package in Disco:
  New
Status in ceph source package in Eoan:
  New

Bug description:
  [Impact]
  Manually changing the value of the environment variable CEPH_VOLUME_SYSTEMD_TRIES or CEPH_VOLUME_SYSTEMD_INTERVAL does not work as intended, because the string value is used as if it were an integer.
  With the default values, ceph-volume-systemd keeps retrying for only about 150 seconds, which is not enough in circumstances where the cluster needs longer to activate the disks (e.g. the disks are encrypted by vault; this can happen when the vault and OSD nodes are restarted).
  In that case, the user needs to increase CEPH_VOLUME_SYSTEMD_TRIES or CEPH_VOLUME_SYSTEMD_INTERVAL so that ceph-volume-systemd keeps retrying long enough to start the OSD.
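
  The retry window described above can be sketched as follows. This is only an illustration, not the upstream code: the helper names env_int and retry_window are hypothetical, and the defaults of 30 tries and 5 seconds are assumptions chosen because they match the roughly 150 seconds mentioned above.

```python
import os

# Assumed defaults: 30 tries x 5 s matches the ~150 s window described above.
DEFAULT_TRIES = 30
DEFAULT_INTERVAL = 5


def env_int(name, default, environ=os.environ):
    """Hypothetical helper: read an environment variable as an int.

    Environment values are always strings, so they must be converted
    explicitly; fall back to the default if unset or not a number.
    """
    raw = environ.get(name)
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        return default


def retry_window(environ=os.environ):
    """Return the total number of seconds spent retrying (tries * interval)."""
    tries = env_int("CEPH_VOLUME_SYSTEMD_TRIES", DEFAULT_TRIES, environ)
    interval = env_int("CEPH_VOLUME_SYSTEMD_INTERVAL", DEFAULT_INTERVAL, environ)
    return tries * interval
```

  Under these assumptions, raising CEPH_VOLUME_SYSTEMD_TRIES to 100 would extend the window from 150 to 500 seconds.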

  [Test case]
  1. Deploy a ceph cluster (we can use luminous).
  2. Set the environment variable CEPH_VOLUME_SYSTEMD_TRIES to 100 on one OSD node.
  3. Debug ceph-volume-systemd in the file src/ceph-volume/ceph_volume/systemd/main.py and print the value of 'tries' after it has been used as an integer. We can see that it is not the integer value 100 that we expected.
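
  The type mismatch that step 3 looks for can be reproduced outside ceph with a few lines of Python. This is a standalone sketch, not the code from main.py; the variable names raw_tries and fixed_tries and the fallback value 30 are assumptions made for illustration.

```python
import os

# Simulate step 2: the variable is set in the environment (always as a string).
os.environ["CEPH_VOLUME_SYSTEMD_TRIES"] = "100"

# Without a conversion, the value read back is the string "100", even though
# the fallback default passed to get() is an int.
raw_tries = os.environ.get("CEPH_VOLUME_SYSTEMD_TRIES", 30)

# With an explicit int() conversion, the value is the integer 100.
fixed_tries = int(os.environ.get("CEPH_VOLUME_SYSTEMD_TRIES", 30))

print(type(raw_tries).__name__, repr(raw_tries))
print(type(fixed_tries).__name__, repr(fixed_tries))
```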

  [Regression Potential]
  Regression potential is very low. One potential case: if a user had already set these two environment variables, then after upgrading to a version with this fix,
  the effective 'tries' might seem to be lower than before, because the previous value was wrong and might have been wrongly enlarged.

  [other info]
  Upstream bug report:   https://tracker.ceph.com/issues/43186
  Upstream pull request: https://github.com/ceph/ceph/pull/32106

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1863704/+subscriptions


