[Bug 1838400] Re: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
Alex Kavanagh
1838400 at bugs.launchpad.net
Thu Oct 31 15:38:41 UTC 2019
This doesn't seem to be a charm bug; if it is related to the charm, then
please re-open.
** Also affects: ceph (Ubuntu)
   Importance: Undecided
       Status: New

** Changed in: charm-ceph-osd
       Status: New => Invalid
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/1838400
Title:
OSD crashes when loading pgs with "FAILED assert(interval.last >
last)"
Status in OpenStack ceph-osd charm:
Invalid
Status in ceph package in Ubuntu:
New
Bug description:
This issue is tracked at https://tracker.ceph.com/issues/21142
Today, I hit it when running the following procedure:
0) nova-compute openstack-origin="cloud:xenial-queens"
1) juju upgrade-series 19 prepare bionic
2) apt-get update, dist-upgrade, do-release-upgrade, reboot
Machine 19 is hyperconverged, and runs nova-compute and ceph-osd.
Ceph was initially 12.2.8, and was updated to the latest in the
xenial-queens repo (12.2.12).
3) "ceph -s" showed half of the cluster with OSDs down due to "FAILED assert(interval.last > last)"
3.1) I tried to upgrade all ceph packages from 12.2.8 to 12.2.12 but the issue remained.
In the end, the suggestion from the linked bug was:
"""
# set [DEFAULT] ceph.conf section to
debug osd = 10/5
# from the CLI (for osd.N)
service ceph-osd@N restart
# wait until coredump is seen in the logs
service ceph-osd@N stop
perl -nle '/pg (\S+) first map.*,\ssame_interval_since/ && print$1' /var/log/ceph/ceph-osd.N.log | sort -u | xargs -I@ bash -xc 'ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-N/ --op rm-past-intervals --pgid @'
"""
Note: running the above on all the OSDs that were down fixed them.
It seems the fix for this issue is still pending backport, and has yet
to be included in the Ubuntu ceph packaging.
4) once the upgrade of that single compute-storage node was over,
4.0) juju config nova-compute openstack-origin=distro
4.1) juju config ceph-osd source=distro
4.2) juju upgrade-series 19 complete
To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-ceph-osd/+bug/1838400/+subscriptions