[Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

Łukasz Zemczak 1828617 at bugs.launchpad.net
Thu Sep 5 12:32:40 UTC 2019


Hello Andrey, or anyone else affected,

Accepted ceph into disco-proposed. The package will build now and be
available at
https://launchpad.net/ubuntu/+source/ceph/13.2.6-0ubuntu0.19.04.4 in a
few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and change the tag from
verification-needed-disco to verification-done-disco. If it does not fix
the bug for you, please add a comment stating that, and change the tag
to verification-failed-disco. In either case, without details of your
testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

** Changed in: ceph (Ubuntu Disco)
       Status: In Progress => Fix Committed

** Tags added: verification-needed verification-needed-disco

** Changed in: ceph (Ubuntu Bionic)
       Status: In Progress => Fix Committed

** Tags added: verification-needed-bionic

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1828617

Title:
  Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

Status in Ubuntu Cloud Archive:
  In Progress
Status in Ubuntu Cloud Archive queens series:
  In Progress
Status in Ubuntu Cloud Archive rocky series:
  In Progress
Status in Ubuntu Cloud Archive stein series:
  In Progress
Status in Ubuntu Cloud Archive train series:
  In Progress
Status in ceph package in Ubuntu:
  Fix Released
Status in ceph source package in Bionic:
  Fix Committed
Status in ceph source package in Disco:
  Fix Committed
Status in ceph source package in Eoan:
  Fix Released

Bug description:
  [Impact]
  For deployments where the bluestore DB and WAL devices are placed on separate underlying devices, it is possible on reboot that the LVs configured on those devices have not yet been scanned and detected; the OSD boot process ignores this and tries to boot the OSD as soon as the primary LV supporting the OSD is detected, resulting in the OSD crashing because the required block device symlinks are not present.
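
  For illustration, a bluestore OSD with separate DB and WAL devices
  keeps symlinks like the following in its data directory (the OSD id
  and VG/LV names here are examples, not values from this report):

      /var/lib/ceph/osd/ceph-0/block     -> /dev/ceph-block-0/osd-block-0   (primary data LV)
      /var/lib/ceph/osd/ceph-0/block.db  -> /dev/ceph-db-0/osd-db-0         (RocksDB metadata LV)
      /var/lib/ceph/osd/ceph-0/block.wal -> /dev/ceph-wal-0/osd-wal-0       (write-ahead log LV)

  If the db or wal LV has not been activated by the time the OSD
  starts, the corresponding symlink target is missing and the OSD
  daemon aborts.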

  [Test Case]
  Deploy ceph with bluestore and separate DB and WAL devices (an
  example invocation is shown below).
  Reboot the servers.
  OSD devices will intermittently fail to start after reboot (this is
  a race condition, so the failure does not occur on every boot).
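
  As an illustration only, an OSD of this shape can be created with
  ceph-volume (the device paths below are examples):

      # create a bluestore OSD with its DB and WAL on separate devices
      ceph-volume lvm create --bluestore \
          --data /dev/sdb \
          --block.db /dev/nvme0n1p1 \
          --block.wal /dev/nvme0n1p2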

  [Regression Potential]
  Low - the fix has landed upstream and simply ensures that if separate LVs are expected for the DB and WAL devices of an OSD, the OSD will not try to boot until those LVs are present.
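
  Conceptually, the guard amounts to something like the following
  before activation proceeds (a simplified sketch, not the literal
  upstream patch; the variable names are placeholders):

      # wait until the LVs backing the OSD's db/wal are visible
      # before letting the OSD activate
      for dev in "$OSD_DB_DEVICE" "$OSD_WAL_DEVICE"; do
          while [ ! -e "$dev" ]; do
              sleep 1    # LV not yet scanned/activated; keep waiting
          done
      done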

  [Original Bug Report]
  Ubuntu 18.04.2 Ceph deployment.

  Ceph OSD devices use LVM volumes pointing to udev-based physical devices.
  The LVM module is supposed to create PVs from devices using the links in the /dev/disk/by-dname/ directory that are created by udev.
  However, on reboot it sometimes happens (not always; it behaves like a race condition) that the Ceph services cannot start and pvdisplay shows no volumes, even though the /dev/disk/by-dname/ directory has all the necessary device links created by the end of the boot process.
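
  A quick way to confirm the failure state on an affected host
  (standard LVM and filesystem tools, nothing specific to this fix):

      pvdisplay                    # shows no physical volumes while the bug is in effect
      ls -l /dev/disk/by-dname/    # the udev links themselves are present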

  The behaviour can be fixed manually by re-activating the LVM
  components, after which the services can be started:

      /sbin/lvm pvscan --cache --activate ay /dev/nvme0n1
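
  Until a fixed package is installed, one possible stop-gap (an
  example only, not part of the official fix; the unit name and device
  path are placeholders) is a one-shot systemd unit that re-runs the
  scan before the OSDs start:

      # /etc/systemd/system/lvm-rescan-workaround.service  (example)
      [Unit]
      Description=Re-scan LVM PVs before Ceph OSDs start (LP: #1828617 workaround)
      Before=ceph-osd.target

      [Service]
      Type=oneshot
      ExecStart=/sbin/lvm pvscan --cache --activate ay /dev/nvme0n1

      [Install]
      WantedBy=multi-user.target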

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1828617/+subscriptions


