[Bug 1804261] Re: Ceph OSD units requires reboot if they boot before vault (and if not unsealed with 150s)

dongdong tao 1804261 at bugs.launchpad.net
Mon Jun 15 05:40:08 UTC 2020


I have verified the fix in bionic-proposed and confirm that it fixes this issue.
The test steps I've performed:
1. Deployed a ceph cluster with vault
2. Upgraded some of the osds to 12.2.13
3. Added "Environment=CEPH_VOLUME_SYSTEMD_TRIES=2000" to /lib/systemd/system/ceph-volume@.service for all osds
4. Rebooted vault first, then rebooted all osds
5. Waited for about 1.5 hours
6. All osds with version 12.2.13 came up, while the other osds with 12.2.12 remained blocked
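
For context on step 5, the time the units keep retrying can be estimated as the number of tries multiplied by the interval between tries. The sketch below is only a rough illustration: the default of 30 tries and the 5 second interval are assumptions (chosen because they match the 150s in the bug title), and retry_budget_seconds is a hypothetical helper, not part of ceph-volume.

```python
def retry_budget_seconds(tries, interval_seconds):
    """Upper bound on how long the retries last, ignoring the time each attempt takes."""
    return tries * interval_seconds


# Assumed defaults: 30 tries, 5 seconds apart.
default_budget = retry_budget_seconds(30, 5)

# With CEPH_VOLUME_SYSTEMD_TRIES=2000 and the interval left at the assumed 5 seconds.
raised_budget = retry_budget_seconds(2000, 5)

print(default_budget)                    # 150
print(round(raised_budget / 3600, 2))    # 2.78 (hours)
```

Under these assumptions the raised setting leaves a window of a little under three hours, which comfortably covers the 1.5 hour wait used in the test.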

Cheers!


** Tags removed: verification-needed verification-queens-needed
** Tags added: verification-done verification-queens-done

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1804261

Title:
  Ceph OSD units requires reboot if they boot before vault (and if not
  unsealed with 150s)

Status in OpenStack ceph-osd charm:
  Invalid
Status in Ubuntu Cloud Archive:
  Fix Released
Status in Ubuntu Cloud Archive queens series:
  Fix Committed
Status in Ubuntu Cloud Archive rocky series:
  Fix Released
Status in Ubuntu Cloud Archive stein series:
  Fix Released
Status in Ubuntu Cloud Archive train series:
  Fix Released
Status in Ubuntu Cloud Archive ussuri series:
  Fix Released
Status in ceph package in Ubuntu:
  Fix Released
Status in ceph source package in Bionic:
  Fix Committed
Status in ceph source package in Disco:
  Won't Fix
Status in ceph source package in Eoan:
  Fix Released
Status in ceph source package in Focal:
  Fix Released

Bug description:
  [Impact]
  Several configuration option values that are read from environment variables are incorrectly parsed as strings rather than ints. As a result, for certain deployment use-cases the timeouts for starting the ceph-osd volume units cannot be increased to allow dependencies to start first.
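
The class of problem described above can be illustrated with a short Python sketch. It is not the ceph-volume code or the actual patch: read_int_env is a hypothetical helper and the default of 30 is an assumption. It only shows that a value read from the environment is always a string and has to be converted before it can be used as a counter.

```python
import os


def read_int_env(name, default):
    """Return environment variable `name` as an int, or `default` if it is unset.

    Hypothetical helper for illustration; int() raises ValueError
    if the variable is set to something non-numeric.
    """
    raw = os.environ.get(name)
    if raw is None:
        return default
    return int(raw)


# Simulate the systemd override from the test case.
os.environ["CEPH_VOLUME_SYSTEMD_TRIES"] = "2000"

# Without conversion the value is the string "2000", not the number 2000.
raw_tries = os.environ.get("CEPH_VOLUME_SYSTEMD_TRIES", 30)

# With conversion the value can be counted down as a retry budget.
tries = read_int_env("CEPH_VOLUME_SYSTEMD_TRIES", 30)

print(type(raw_tries).__name__, type(tries).__name__)
```

Note that the unconverted lookup also mixes types: it yields an int when the variable is unset (the default) and a str when it is set, which is why the problem only shows up once an override is configured.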

  [Test Case]
  Deploy ceph with vault for key management
  Set a systemd override for ceph-volume@:
  Environment=CEPH_VOLUME_SYSTEMD_TRIES=2000
  Seal vault units (by restarting the vault service)
  Reboot ceph-osd machines - the Environment override is ignored as it is not correctly parsed.

  [Regression Potential]
  Low - this fix has been accepted upstream in later releases.

  
  [Original Bug Report]
  In a scenario where Ceph is encrypted and uses Vault as the key manager, and where vault and ceph are both stopped, any OSDs on the affected unit(s) will require a further reboot if they try to start before vault is unsealed.

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-ceph-osd/+bug/1804261/+subscriptions


