[Bug 1682424] Re: All monitors crashes after OSD host reboot

George Shuklin 1682424 at bugs.launchpad.net
Thu Apr 13 13:03:36 UTC 2017


Related Ceph bug: http://tracker.ceph.com/issues/19606

** Description changed:

  All three monitors have been crashing repeatedly since a host with a
  few OSDs was rebooted.
  
  Trace from monitor:
  ...
-     -2> 2017-04-13 12:34:39.204681 7fd2857aa700  5 -- op tracker -- seq: 22, time: 2017-04-13 12:34:39.204681, event: osdmap:prepare_update, op: osd_boot(osd.21 booted 0 features 576460752032874495 v2022)
-     -1> 2017-04-13 12:34:39.204693 7fd2857aa700  5 -- op tracker -- seq: 22, time: 2017-04-13 12:34:39.204692, event: osdmap:prepare_boot, op: osd_boot(osd.21 booted 0 features 576460752032874495 v2022)
-      0> 2017-04-13 12:34:39.213266 7fd2857aa700 -1 mon/OSDMonitor.cc: In function 'bool OSDMonitor::prepare_boot(MonOpRequestRef)' thread 7fd2857aa700 time 2017-04-13 12:34:39.204709
+     -2> 2017-04-13 12:34:39.204681 7fd2857aa700  5 -- op tracker -- seq: 22, time: 2017-04-13 12:34:39.204681, event: osdmap:prepare_update, op: osd_boot(osd.21 booted 0 features 576460752032874495 v2022)
+     -1> 2017-04-13 12:34:39.204693 7fd2857aa700  5 -- op tracker -- seq: 22, time: 2017-04-13 12:34:39.204692, event: osdmap:prepare_boot, op: osd_boot(osd.21 booted 0 features 576460752032874495 v2022)
+      0> 2017-04-13 12:34:39.213266 7fd2857aa700 -1 mon/OSDMonitor.cc: In function 'bool OSDMonitor::prepare_boot(MonOpRequestRef)' thread 7fd2857aa700 time 2017-04-13 12:34:39.204709
  mon/OSDMonitor.cc: 2105: FAILED assert(osdmap.get_uuid(from) == m->sb.osd_fsid)
  
-  ceph version 10.2.6 (656b5b63ed7c43bd014bcafd81b001959d5f089f)
-  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x80) [0x55f6c17cc260]
-  2: (OSDMonitor::prepare_boot(std::shared_ptr<MonOpRequest>)+0x1bd2) [0x55f6c1477e82]
-  3: (OSDMonitor::prepare_update(std::shared_ptr<MonOpRequest>)+0x28b) [0x55f6c14aaa1b]
-  4: (PaxosService::dispatch(std::shared_ptr<MonOpRequest>)+0xb4f) [0x55f6c145b84f]
-  5: (PaxosService::C_RetryMessage::_finish(int)+0x58) [0x55f6c145ce38]
-  6: (C_MonOp::finish(int)+0x82) [0x55f6c14250c2]
-  7: (Context::complete(int)+0x9) [0x55f6c14241a9]
-  8: (void finish_contexts<Context>(CephContext*, std::__cxx11::list<Context*, std::allocator<Context*> >&, int)+0x1fb) [0x55f6c142a8db]
-  9: (Paxos::finish_round()+0x287) [0x55f6c14512b7]
-  10: (Paxos::handle_last(std::shared_ptr<MonOpRequest>)+0xe19) [0x55f6c1452499]
-  11: (Paxos::dispatch(std::shared_ptr<MonOpRequest>)+0x250) [0x55f6c1452cc0]
-  12: (Monitor::dispatch_op(std::shared_ptr<MonOpRequest>)+0xa38) [0x55f6c141e5c8]
-  13: (Monitor::_ms_dispatch(Message*)+0x554) [0x55f6c141edc4]
-  14: (Monitor::ms_dispatch(Message*)+0x23) [0x55f6c1441e93]
-  15: (DispatchQueue::entry()+0xf2b) [0x55f6c18c1fab]
-  16: (DispatchQueue::DispatchThread::entry()+0xd) [0x55f6c17b25ad]
-  17: (()+0x76fa) [0x7fd28de0e6fa]
-  18: (clone()+0x6d) [0x7fd28c0c8b5d]
-  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
+  ceph version 10.2.6 (656b5b63ed7c43bd014bcafd81b001959d5f089f)
+  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x80) [0x55f6c17cc260]
+  2: (OSDMonitor::prepare_boot(std::shared_ptr<MonOpRequest>)+0x1bd2) [0x55f6c1477e82]
+  3: (OSDMonitor::prepare_update(std::shared_ptr<MonOpRequest>)+0x28b) [0x55f6c14aaa1b]
+  4: (PaxosService::dispatch(std::shared_ptr<MonOpRequest>)+0xb4f) [0x55f6c145b84f]
+  5: (PaxosService::C_RetryMessage::_finish(int)+0x58) [0x55f6c145ce38]
+  6: (C_MonOp::finish(int)+0x82) [0x55f6c14250c2]
+  7: (Context::complete(int)+0x9) [0x55f6c14241a9]
+  8: (void finish_contexts<Context>(CephContext*, std::__cxx11::list<Context*, std::allocator<Context*> >&, int)+0x1fb) [0x55f6c142a8db]
+  9: (Paxos::finish_round()+0x287) [0x55f6c14512b7]
+  10: (Paxos::handle_last(std::shared_ptr<MonOpRequest>)+0xe19) [0x55f6c1452499]
+  11: (Paxos::dispatch(std::shared_ptr<MonOpRequest>)+0x250) [0x55f6c1452cc0]
+  12: (Monitor::dispatch_op(std::shared_ptr<MonOpRequest>)+0xa38) [0x55f6c141e5c8]
+  13: (Monitor::_ms_dispatch(Message*)+0x554) [0x55f6c141edc4]
+  14: (Monitor::ms_dispatch(Message*)+0x23) [0x55f6c1441e93]
+  15: (DispatchQueue::entry()+0xf2b) [0x55f6c18c1fab]
+  16: (DispatchQueue::DispatchThread::entry()+0xd) [0x55f6c17b25ad]
+  17: (()+0x76fa) [0x7fd28de0e6fa]
+  18: (clone()+0x6d) [0x7fd28c0c8b5d]
+  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
+ 
+ 
+ Affected version: 10.2.6-0ubuntu0.16.04.1

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/1682424

Title:
  All monitors crashes after OSD host reboot

Status in ceph package in Ubuntu:
  New

Bug description:
  All three monitors have been crashing repeatedly since a host with a
  few OSDs was rebooted.

  Trace from monitor:
  ...
      -2> 2017-04-13 12:34:39.204681 7fd2857aa700  5 -- op tracker -- seq: 22, time: 2017-04-13 12:34:39.204681, event: osdmap:prepare_update, op: osd_boot(osd.21 booted 0 features 576460752032874495 v2022)
      -1> 2017-04-13 12:34:39.204693 7fd2857aa700  5 -- op tracker -- seq: 22, time: 2017-04-13 12:34:39.204692, event: osdmap:prepare_boot, op: osd_boot(osd.21 booted 0 features 576460752032874495 v2022)
       0> 2017-04-13 12:34:39.213266 7fd2857aa700 -1 mon/OSDMonitor.cc: In function 'bool OSDMonitor::prepare_boot(MonOpRequestRef)' thread 7fd2857aa700 time 2017-04-13 12:34:39.204709
  mon/OSDMonitor.cc: 2105: FAILED assert(osdmap.get_uuid(from) == m->sb.osd_fsid)

   ceph version 10.2.6 (656b5b63ed7c43bd014bcafd81b001959d5f089f)
   1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x80) [0x55f6c17cc260]
   2: (OSDMonitor::prepare_boot(std::shared_ptr<MonOpRequest>)+0x1bd2) [0x55f6c1477e82]
   3: (OSDMonitor::prepare_update(std::shared_ptr<MonOpRequest>)+0x28b) [0x55f6c14aaa1b]
   4: (PaxosService::dispatch(std::shared_ptr<MonOpRequest>)+0xb4f) [0x55f6c145b84f]
   5: (PaxosService::C_RetryMessage::_finish(int)+0x58) [0x55f6c145ce38]
   6: (C_MonOp::finish(int)+0x82) [0x55f6c14250c2]
   7: (Context::complete(int)+0x9) [0x55f6c14241a9]
   8: (void finish_contexts<Context>(CephContext*, std::__cxx11::list<Context*, std::allocator<Context*> >&, int)+0x1fb) [0x55f6c142a8db]
   9: (Paxos::finish_round()+0x287) [0x55f6c14512b7]
   10: (Paxos::handle_last(std::shared_ptr<MonOpRequest>)+0xe19) [0x55f6c1452499]
   11: (Paxos::dispatch(std::shared_ptr<MonOpRequest>)+0x250) [0x55f6c1452cc0]
   12: (Monitor::dispatch_op(std::shared_ptr<MonOpRequest>)+0xa38) [0x55f6c141e5c8]
   13: (Monitor::_ms_dispatch(Message*)+0x554) [0x55f6c141edc4]
   14: (Monitor::ms_dispatch(Message*)+0x23) [0x55f6c1441e93]
   15: (DispatchQueue::entry()+0xf2b) [0x55f6c18c1fab]
   16: (DispatchQueue::DispatchThread::entry()+0xd) [0x55f6c17b25ad]
   17: (()+0x76fa) [0x7fd28de0e6fa]
   18: (clone()+0x6d) [0x7fd28c0c8b5d]
   NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

  
  Affected version: 10.2.6-0ubuntu0.16.04.1
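
For triage across all three monitors, the trace above can be reduced to the
failed assertion and the OSD named in the osd_boot op. The following is a
minimal sketch, not part of the original report: the function name
summarize_crash and both regular expressions are assumptions derived only
from the lines quoted in this trace, not from a documented Ceph log format.

```python
import re

# Both patterns are assumptions based on the trace quoted in this report.
# "<file>: <line>: FAILED assert(<expr>)"
ASSERT_RE = re.compile(
    r"^\s*(?P<file>\S+): (?P<line>\d+): FAILED assert\((?P<expr>.*)\)\s*$"
)
# "op: osd_boot(osd.<id> booted ..."
BOOT_RE = re.compile(r"op: osd_boot\(osd\.(?P<osd>\d+) ")


def summarize_crash(log_text):
    """Return the failed assertion and the OSD ids seen in osd_boot ops."""
    summary = {"assertion": None, "osds": []}
    for line in log_text.splitlines():
        found = ASSERT_RE.match(line)
        if found:
            # Keep the source file, line number, and asserted expression.
            summary["assertion"] = (
                found.group("file"),
                int(found.group("line")),
                found.group("expr"),
            )
        for boot in BOOT_RE.finditer(line):
            osd_id = int(boot.group("osd"))
            # Record each OSD id once, in order of first appearance.
            if osd_id not in summary["osds"]:
                summary["osds"].append(osd_id)
    return summary
```

Passing the text of each monitor's log to summarize_crash makes it easy to
check whether every crash hits the same assertion for the same OSD (osd.21 in
the trace above).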



