[Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

Corey Bryant corey.bryant at canonical.com
Wed May 29 19:42:46 UTC 2019


I didn't recreate this, but I did get a deployment on serverstack with
bluestore WAL and DB devices. That was done with:

1) juju deploy --series bionic --num-units 1 --constraints mem=2G \
     --config expected-osd-count=1 --config monitor-count=1 \
     cs:ceph-mon ceph-mon

2) juju deploy --series bionic --num-units 1 --constraints mem=2G \
     --storage osd-devices=cinder,10G --storage bluestore-wal=cinder,1G \
     --storage bluestore-db=cinder,1G \
     cs:ceph-osd ceph-osd

3) juju add-relation ceph-osd ceph-mon

James Page mentioned taking a look at the systemd bits.

ceph-osd systemd unit
---------------------
/lib/systemd/system/ceph-osd@.service calls:
ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i

Where /usr/lib/ceph/ceph-osd-prestart.sh has some logic that bails out
when certain things aren't ready yet. I think we might be able to add
something similar in there. For example, it currently has:

data="/var/lib/ceph/osd/${cluster:-ceph}-$id"

if [ -L "$journal" -a ! -e "$journal" ]; then
    udevadm settle --timeout=5 || :
    if [ -L "$journal" -a ! -e "$journal" ]; then
        echo "ceph-osd(${cluster:-ceph}-$id): journal not present, not starting yet." 1>&2
        exit 0
    fi
fi

The 'udevadm settle' call watches the udev event queue and exits once
all current events have been handled, or after the 5 second timeout
expires. Perhaps we can do something similar for this issue.
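
To make that concrete, here's a rough, untested sketch of what an
equivalent check for the bluestore devices could look like in
ceph-osd-prestart.sh. It reuses the $data, $cluster and $id variables
the script already sets; the block/block.db/block.wal paths match those
shown in the log below.

for dev in "$data/block" "$data/block.db" "$data/block.wal"; do
    # block.db and block.wal only exist for OSDs deployed with separate
    # DB/WAL devices, so only check entries that are symlinks.
    if [ -L "$dev" -a ! -e "$dev" ]; then
        udevadm settle --timeout=5 || :
        if [ -L "$dev" -a ! -e "$dev" ]; then
            echo "ceph-osd(${cluster:-ceph}-$id): $dev not present, not starting yet." 1>&2
            exit 0
        fi
    fi
done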

Here's what I see in /var/log/ceph/ceph-osd.0.log during a system reboot:
-------------------------------------------------------------------------
2019-05-29 19:04:25.800237 7fa6940d1700  1 freelist shutdown
...
2019-05-29 19:04:25.800548 7fa6940d1700  1 bdev(0x557eca7a1680 /var/lib/ceph/osd/ceph-0/block.wal) close
2019-05-29 19:04:26.079227 7fa6940d1700  1 bdev(0x557eca7a1200 /var/lib/ceph/osd/ceph-0/block.db) close
2019-05-29 19:04:26.266085 7fa6940d1700  1 bdev(0x557eca7a1440 /var/lib/ceph/osd/ceph-0/block) close
2019-05-29 19:04:26.474086 7fa6940d1700  1 bdev(0x557eca7a0fc0 /var/lib/ceph/osd/ceph-0/block) close
...
2019-05-29 19:04:53.601570 7fdd2ec17e40  1 bdev create path /var/lib/ceph/osd/ceph-0/block.db type kernel
2019-05-29 19:04:53.601581 7fdd2ec17e40  1 bdev(0x561e50583200 /var/lib/ceph/osd/ceph-0/block.db) open path /var/lib/ceph/osd/ceph-0/block.db
2019-05-29 19:04:53.601855 7fdd2ec17e40  1 bdev(0x561e50583200 /var/lib/ceph/osd/ceph-0/block.db) open size 1073741824 (0x40000000, 1GiB) block_size 4096 (4KiB) rotational
2019-05-29 19:04:53.601867 7fdd2ec17e40  1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-0/block.db size 1GiB
2019-05-29 19:04:53.602131 7fdd2ec17e40  1 bdev create path /var/lib/ceph/osd/ceph-0/block type kernel
2019-05-29 19:04:53.602143 7fdd2ec17e40  1 bdev(0x561e50583440 /var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
2019-05-29 19:04:53.602464 7fdd2ec17e40  1 bdev(0x561e50583440 /var/lib/ceph/osd/ceph-0/block) open size 10733223936 (0x27fc00000, 10.0GiB) block_size 4096 (4KiB) rotational
2019-05-29 19:04:53.602480 7fdd2ec17e40  1 bluefs add_block_device bdev 2 path /var/lib/ceph/osd/ceph-0/block size 10.0GiB
2019-05-29 19:04:53.602499 7fdd2ec17e40  1 bdev create path /var/lib/ceph/osd/ceph-0/block.wal type kernel
2019-05-29 19:04:53.602502 7fdd2ec17e40  1 bdev(0x561e50583680 /var/lib/ceph/osd/ceph-0/block.wal) open path /var/lib/ceph/osd/ceph-0/block.wal
2019-05-29 19:04:53.602709 7fdd2ec17e40  1 bdev(0x561e50583680 /var/lib/ceph/osd/ceph-0/block.wal) open size 100663296 (0x6000000, 96MiB) block_size 4096 (4KiB) rotational
2019-05-29 19:04:53.602717 7fdd2ec17e40  1 bluefs add_block_device bdev 0 path /var/lib/ceph/osd/ceph-0/block.wal size 96MiB
...

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/1828617

Title:
  Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

Status in ceph package in Ubuntu:
  New

Bug description:
  Ubuntu 18.04.2 Ceph deployment.

  Ceph OSD devices utilize LVM volumes pointing to udev-based physical
  devices. The LVM module is supposed to create PVs from devices using the
  links in the /dev/disk/by-dname/ folder that are created by udev.
  However, on reboot it sometimes happens (not always, which suggests a
  race condition) that the Ceph services cannot start and pvdisplay doesn't
  show any created volumes. The /dev/disk/by-dname/ folder, however, does
  have all the necessary devices created by the end of the boot process.

  The behaviour can be fixed manually by running "/sbin/lvm pvscan
  --cache --activate ay /dev/nvme0n1" (as root) to re-activate the LVM
  components, after which the services can be started.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1828617/+subscriptions


