[Bug 1713032] Re: [luminous] ceph-mon crashes when it is elected leader (s390x)
Andrew McLeod
andrew.mcleod at canonical.com
Tue Jan 22 14:39:17 UTC 2019
If you add the debug symbols repository as described here:
https://wiki.ubuntu.com/Debug%20Symbol%20Packages#Manual_install_of_debug_packages
the debug symbols should become available to install. It was mentioned
that they may be available for luminous but not for mimic; in that
case luminous can be used, as the bug is exactly the same in both
versions.
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1713032
Title:
[luminous] ceph-mon crashes when it is elected leader (s390x)
Status in Ubuntu Cloud Archive:
Triaged
Status in Ubuntu on IBM z Systems:
In Progress
Status in ceph package in Ubuntu:
In Progress
Bug description:
ceph-mon - s390x - 1 of my 3 nodes decides it is the leader, then
crashes:
Summary:
2017-08-25 10:30:49.764717 3ff9a7ff910 1 mon.juju-a9ec9d-1@0(electing).elector(105) init, last seen epoch 105
2017-08-25 10:30:55.288336 3ff9a7ff910 0 log_channel(cluster) log [INF] : mon.juju-a9ec9d-1@0 won leader election with quorum 0,1
2017-08-25 10:30:55.487872 3ff9a7ff910 0 log_channel(cluster) log [INF] : HEALTH_ERR; no osds; 1 mons down, quorum 0,1 juju-a9ec9d-1,juju-a9ec9d-0
2017-08-25 10:30:56.047020 3ff8bfff910 0 log_channel(cluster) log [INF] : monmap e1: 3 mons at {juju-a9ec9d-0=10.0.8.105:6789/0,juju-a9ec9d-1=10.0.8.84:6789/0,noname-b=10.0.8.179:6789/0}
2017-08-25 10:30:56.047050 3ff8bfff910 0 log_channel(cluster) log [INF] : pgmap 0 pgs: ; 0 bytes data, 0 kB used, 0 kB / 0 kB avail
2017-08-25 10:30:56.047073 3ff8bfff910 0 log_channel(cluster) log [DBG] : fsmap
2017-08-25 10:30:56.047122 3ff8bfff910 1 mon.juju-a9ec9d-1@0(leader).osd e0 create_pending setting backfillfull_ratio = 0.9
2017-08-25 10:30:56.047135 3ff8bfff910 1 mon.juju-a9ec9d-1@0(leader).osd e0 create_pending setting full_ratio = 0.95
2017-08-25 10:30:56.047137 3ff8bfff910 1 mon.juju-a9ec9d-1@0(leader).osd e0 create_pending setting nearfull_ratio = 0.85
2017-08-25 10:30:56.047288 3ff8bfff910 1 mon.juju-a9ec9d-1@0(leader).osd e0 encode_pending skipping prime_pg_temp; mapping job did not start
2017-08-25 10:30:56.051808 3ff8bfff910 -1 *** Caught signal (Aborted) **
in thread 3ff8bfff910 thread_name:ms_dispatch
ceph version 12.1.2 (b661348f156f148d764b998b65b90451f096cb27) luminous (rc)
1: (()+0x9334b4) [0x2aa0f9334b4]
2: [0x3ff8bff9b66]
3: (gsignal()+0x30) [0x3ffa16381b8]
4: (abort()+0x14e) [0x3ffa1639726]
5: (__gnu_cxx::__verbose_terminate_handler()+0x19c) [0x3ffa1a28e2c]
6: (()+0xa6776) [0x3ffa1a26776]
7: (()+0xa67d8) [0x3ffa1a267d8]
8: (__cxa_rethrow()+0x64) [0x3ffa1a26adc]
9: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0xdc2) [0x2aa0f8b4d92]
10: (OSDMap::decode(ceph::buffer::list::iterator&)+0x5c4) [0x2aa0f739d44]
11: (OSDMap::decode(ceph::buffer::list&)+0x44) [0x2aa0f73c434]
12: (OSDMap::apply_incremental(OSDMap::Incremental const&)+0x1782) [0x2aa0f73dbe2]
13: (OSDMonitor::encode_pending(std::shared_ptr<MonitorDBStore::Transaction>)+0x212) [0x2aa0f55cf3a]
14: (PaxosService::propose_pending()+0x2be) [0x2aa0f5214d6]
15: (PaxosService::_active()+0x2be) [0x2aa0f521bbe]
16: (Context::complete(int)+0x1e) [0x2aa0f3f1d86]
17: (void finish_contexts<Context>(CephContext*, std::__cxx11::list<Context*, std::allocator<Context*> >&, int)+0x212) [0x2aa0f3fb0ea]
18: (Paxos::finish_round()+0x194) [0x2aa0f51937c]
19: (Paxos::handle_last(boost::intrusive_ptr<MonOpRequest>)+0xfb2) [0x2aa0f51a97a]
20: (Paxos::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x2d4) [0x2aa0f51b2c4]
21: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0xf20) [0x2aa0f3e86b8]
22: (Monitor::_ms_dispatch(Message*)+0x64e) [0x2aa0f3e91ae]
23: (Monitor::ms_dispatch(Message*)+0x34) [0x2aa0f41919c]
24: (DispatchQueue::entry()+0xf0c) [0x2aa0f8df744]
25: (DispatchQueue::DispatchThread::entry()+0x18) [0x2aa0f6ed828]
26: (()+0x7934) [0x3ffa1e87934]
27: (()+0xedd1a) [0x3ffa16edd1a]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
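As an aid to interpreting the backtrace above, here is a minimal Python sketch that splits each frame into its number, symbol, offset and address. It assumes frames follow only the three shapes seen in this report; `FRAME_RE` and `parse_frame` are hypothetical names for illustration and are not part of ceph or apport.

```python
import re

# Frame shapes seen in the backtrace above (assumed to be the only ones):
#   "3: (gsignal()+0x30) [0x3ffa16381b8]"   symbol + offset + address
#   "1: (()+0x9334b4) [0x2aa0f9334b4]"      offset + address, no symbol
#   "2: [0x3ff8bff9b66]"                    address only
FRAME_RE = re.compile(
    r"^\s*(?P<num>\d+):\s+"
    r"(?:\((?P<symbol>.*)\+(?P<offset>0x[0-9a-fA-F]+)\)\s+)?"
    r"\[(?P<addr>0x[0-9a-fA-F]+)\]\s*$"
)


def parse_frame(line):
    """Return a dict describing one backtrace frame, or None if no match."""
    m = FRAME_RE.match(line)
    if m is None:
        return None
    symbol = m.group("symbol")
    if symbol in (None, "()"):
        # "()" means the symbol name could not be resolved for this frame.
        symbol = None
    offset = m.group("offset")
    return {
        "num": int(m.group("num")),
        "symbol": symbol,
        "offset": int(offset, 16) if offset else None,
        "addr": int(m.group("addr"), 16),
    }


if __name__ == "__main__":
    sample = [
        "1: (()+0x9334b4) [0x2aa0f9334b4]",
        "2: [0x3ff8bff9b66]",
        "9: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0xdc2) [0x2aa0f8b4d92]",
    ]
    for line in sample:
        print(parse_frame(line))
```

Frames with no symbol are the ones that most need debug symbols to resolve; with the dbgsym packages installed, their offsets could be passed to a symbolizer such as `addr2line -f -C -e <executable>`, assuming the offset is relative to that executable.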
Full log:
https://pastebin.canonical.com/196718/
ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: ceph 12.1.2-0ubuntu2~cloud0 [origin: Canonical]
ProcVersionSignature: Ubuntu 4.4.0-87.110-generic 4.4.73
Uname: Linux 4.4.0-87-generic s390x
NonfreeKernelModules: ebtable_broute vport_gre ip_gre gre ip_tunnel xt_CT xt_mac xt_physdev br_netfilter xt_set ip_set_hash_net ip_set nfnetlink xt_REDIRECT nf_nat_redirect nf_conntrack_ipv6 ip6table_mangle xt_nat xt_mark xt_connmark ip6table_raw iptable_raw xt_conntrack ipt_REJECT nf_reject_ipv4 ebtable_filter nbd openvswitch nf_defrag_ipv6 ebt_arp ebt_dnat ebt_ip scsi_transport_iscsi binfmt_misc veth ip6table_filter ip6_tables xt_CHECKSUM iptable_mangle xt_tcpudp ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_comment iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack zfs zunicode zcommon znvpair spl zavl zlib_deflate iptable_filter ip_tables ebt_snat ebtable_nat ebtables x_tables bridge 8021q garp mrp stp llc xfs libcrc32c dm_snapshot dm_bufio ghash_s390 prng aes_s390 des_s390 des_generic sha512_s390 sha256_s390 sha1_s390 qeth_l2 sha_common chsc_sch eadm_sch qeth ctcm ccwgroup fsm zfcp qdio scsi_transport_fc dasd_eckd_mod dasd_mod
ApportVersion: 2.20.1-0ubuntu2.10
Architecture: s390x
CrashDB:
{
"impl": "launchpad",
"project": "cloud-archive",
"bug_pattern_url": "http://people.canonical.com/~ubuntu-archive/bugpatterns/bugpatterns.xml",
}
Date: Fri Aug 25 11:17:36 2017
SourcePackage: ceph
UpgradeStatus: No upgrade log present (probably fresh install)
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1713032/+subscriptions