[Bug 1838400] Re: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
Launchpad Bug Tracker
1838400 at bugs.launchpad.net
Tue Jun 9 04:17:18 UTC 2020
[Expired for ceph (Ubuntu) because there has been no activity for 60
days.]
** Changed in: ceph (Ubuntu)
Status: Incomplete => Expired
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/1838400
Title:
OSD crashes when loading pgs with "FAILED assert(interval.last >
last)"
Status in OpenStack ceph-osd charm:
Invalid
Status in ceph package in Ubuntu:
Expired
Bug description:
This issue is tracked at https://tracker.ceph.com/issues/21142
Today, I hit it when running the following procedure:
0) nova-compute openstack-origin="cloud:xenial-queens"
1) juju upgrade-series 19 prepare bionic
2) apt-get update, dist-upgrade, do-release-upgrade, reboot
Machine 19 is hyperconverged, and runs nova-compute and ceph-osd.
Ceph was initially 12.2.8, and was updated to the latest in the
xenial-queens repo (12.2.12).
3) "ceph -s" showed half of the cluster with OSDs down due to "FAILED assert(interval.last > last)"
3.1) I tried to upgrade all ceph packages from 12.2.8 to 12.2.12 but the issue remained.
In the end, the suggestion from the linked bug was:
"""
# set [DEFAULT] ceph.conf section to
debug osd = 10/5
# from the CLI (for osd.N)
service ceph-osd@N restart
# wait until coredump is seen in the logs
service ceph-osd@N stop
perl -nle '/pg (\S+) first map.*,\ssame_interval_since/ && print$1' /var/log/ceph/ceph-osd.N.log | sort -u | xargs -I@ bash -xc 'ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-N/ --op rm-past-intervals --pgid @'
"""
Note: running the above on all the OSDs that were down fixed them.
It seems the fix for this is still pending a backport, and has yet to
be included in the Ubuntu ceph packaging.
4) once the upgrade of that single compute-storage node was over,
4.0) juju config nova-compute openstack-origin=distro
4.1) juju config ceph-osd source=distro
4.2) juju upgrade-series 19 complete
To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-ceph-osd/+bug/1838400/+subscriptions