[Bug 1943423] Re: mgr crashs in 16.2.5 / clock-skew

sascha arthur 1943423 at bugs.launchpad.net
Fri Sep 17 12:47:47 UTC 2021


I added the following cron job on top of ntpd to see whether it solves the issue:

0 * * * * /usr/sbin/hwclock -w --verbose --update-drift >> /tmp/hwclock.log

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/1943423

Title:
  mgr crashs in 16.2.5 / clock-skew

Status in ceph package in Ubuntu:
  New

Bug description:
  Hello,

  Running inside a KVM guest on Ubuntu Impish with the latest ceph version.
  I can reproduce it on at least 3 freshly reinstalled ceph clusters.

  Here's the crash info for my mgrs:

  ceph crash info 2021-09-12T21:09:22.866793Z_2419107c-082c-457a-b5e6-a376d779b32f

  {
      "archived": "2021-09-13 07:59:37.681606",
      "backtrace": [
          "/lib/x86_64-linux-gnu/libc.so.6(+0x46510) [0x7f62e074e510]",
          "(std::_Rb_tree_increment(std::_Rb_tree_node_base*)+0x13) [0x7f62e0af5573]",
          "(PGMap::apply_incremental(ceph::common::CephContext*, PGMap::Incremental const&)+0xb60) [0x5633639f6320]",
          "(ClusterState::notify_osdmap(OSDMap const&)+0x29d) [0x563363a89f1d]",
          "(Mgr::handle_osd_map()+0x854) [0x563363ae0694]",
          "(Mgr::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x568) [0x563363ae0eb8]",
          "(MgrStandby::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0xb8) [0x563363af1118]",
          "(Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0x450) [0x7f62e1101d30]",
          "(DispatchQueue::entry()+0x647) [0x7f62e10ff0e7]",
          "(DispatchQueue::DispatchThread::entry()+0x11) [0x7f62e11be921]",
          "/lib/x86_64-linux-gnu/libc.so.6(+0x988d7) [0x7f62e07a08d7]",
          "/lib/x86_64-linux-gnu/libc.so.6(+0x129510) [0x7f62e0831510]"
      ],
      "ceph_version": "16.2.5",
      "crash_id": "2021-09-12T21:09:22.866793Z_2419107c-082c-457a-b5e6-a376d779b32f",
      "entity_name": "mgr.ceph-00002",
      "os_id": "21.10",
      "os_name": "Ubuntu Impish Indri (development branch)",
      "os_version": "21.10 (Impish Indri)",
      "os_version_id": "21.10",
      "process_name": "ceph-mgr",
      "stack_sig": "eccaccb958ebf382237486176ce43b704db9b0ec4b004a7697e77140821e88e9",
      "timestamp": "2021-09-12T21:09:22.866793Z",
      "utsname_hostname": "ceph-00002.dc-003.xxx",
      "utsname_machine": "x86_64",
      "utsname_release": "5.13.0-14-generic",
      "utsname_sysname": "Linux",
      "utsname_version": "#14-Ubuntu SMP Mon Aug 2 12:43:35 UTC 2021"
  }
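
As a reading aid, here is a minimal Python sketch that pulls the symbolized frames out of a `ceph crash info` JSON document like the one above. The function name `summarize_crash` and the frame regex are assumptions made for illustration, based only on the frame layout shown in this report; they are not part of ceph. The sample is abbreviated to three frames from the report.

```python
import json
import re

# Assumed frame layout, taken from the report above:
#   "(symbol+0xOFFSET) [0xADDRESS]"
# Frames without a symbol (e.g. the raw libc.so.6 entries) do not match.
FRAME_RE = re.compile(r"^\((?P<symbol>.+)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$")


def summarize_crash(crash_json: str) -> dict:
    """Return the entity, ceph version, and symbolized frames of a crash report."""
    crash = json.loads(crash_json)
    symbols = []
    for frame in crash.get("backtrace", []):
        match = FRAME_RE.match(frame)
        if match:
            symbols.append(match.group("symbol"))
    return {
        "entity": crash.get("entity_name"),
        "version": crash.get("ceph_version"),
        "symbols": symbols,
    }


# Abbreviated copy of the crash info from this report (three frames only).
sample_report = json.dumps({
    "backtrace": [
        "/lib/x86_64-linux-gnu/libc.so.6(+0x46510) [0x7f62e074e510]",
        "(std::_Rb_tree_increment(std::_Rb_tree_node_base*)+0x13) [0x7f62e0af5573]",
        "(Mgr::handle_osd_map()+0x854) [0x563363ae0694]",
    ],
    "ceph_version": "16.2.5",
    "entity_name": "mgr.ceph-00002",
})

summary = summarize_crash(sample_report)
print(summary["entity"], summary["version"])
for symbol in summary["symbols"]:
    print("  ", symbol)
```

Listing only the symbolized frames makes it easier to compare crashes across the reinstalled clusters: if the same functions appear in the same order, the crashes most likely share a cause.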

  
  On top of that, the MGRs sometimes have "clock-skew" issues: even
  though the daemon is running, it loses its connection and is kicked
  out of the cluster. The host and the KVM guest are both definitely
  NTP-synchronized.

  I'm not sure whether this "clock-skew" is related to the crash, but I
  will post the log as soon as I have it again.




