[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

Maksym Medvied 2089565 at bugs.launchpad.net
Sat Dec 21 20:16:21 UTC 2024


Let's find this offset in the disassembled function:

(gdb) disassemble/m 'MDSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)'
Dump of assembler code for function _ZN6MDSMap6decodeERN4ceph6buffer7v15_2_04list13iterator_implILb1EEE:
Address range 0x7ffff7cc2e10 to 0x7ffff7cc3c4d:
837     {
   0x00007ffff7cc2e10 <+0>:     endbr64
   0x00007ffff7cc2e14 <+4>:     push   %rbp
   0x00007ffff7cc2e15 <+5>:     mov    %rsp,%rbp
...
963       if (ev >= 17) {                                                           
   0x00007ffff7cc3a65 <+3157>:  cmp    $0x10,%r12w                                  
   0x00007ffff7cc3a6a <+3162>:  je     0x7ffff7cc3371 <_ZN6MDSMap6decodeERN4ceph6buffer7v15_2_04list13iterator_implILb1EEE+1377>    
                                                                                    
964         decode(bal_rank_mask, p);                                               
   0x00007ffff7cc3a70 <+3168>:  lea    -0x2a4(%rbp),%rdx                            
   0x00007ffff7cc3a77 <+3175>:  mov    $0x4,%esi                                    
   0x00007ffff7cc3a7c <+3180>:  mov    %r13,%rdi                                    
   0x00007ffff7cc3a7f <+3183>:  lea    0x1c0(%rbx),%r14                             
                                                                                    
965       }                                                                         
966                                                                                 
967       if (ev >= 18) {                                                           
   0x00007ffff7cc3ab1 <+3233>:  cmp    $0x11,%r12w                                  
   0x00007ffff7cc3ab6 <+3238>:  je     0x7ffff7cc3371 <_ZN6MDSMap6decodeERN4ceph6buffer7v15_2_04list13iterator_implILb1EEE+1377>    
                                                                                    
968         decode(max_xattr_size, p);                                              
969       }                                                                         
970                                                                                 
971       if (ev >= 19) {                                                           
   0x00007ffff7cc3ade <+3278>:  cmp    $0x12,%r12w                                  
...

Frame 10 of the backtrace is MDSMap::decode(...)+0xca1, and 0xca1 is
3233 in decimal. The return address is therefore 0x00007ffff7cc3ab1
<+3233>, so we're looking for a call just before that.
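
The offset arithmetic can be double-checked with a short Python sketch.
The constants are copied from the gdb session and from frame 10 of the
backtrace in the bug description; the helper name return_address is only
illustrative and is not part of gdb or Ceph.

```python
# Values copied from the gdb session and the crash backtrace.
FUNC_START_GDB = 0x7ffff7cc2e10    # start of MDSMap::decode in the gdb session
FRAME_OFFSET = 0xca1               # from "MDSMap::decode(...)+0xca1" (frame 10)
FRAME_ADDR_CRASH = 0x78840a4c3ab1  # absolute address printed for frame 10


def return_address(func_start: int, offset: int) -> int:
    """Hypothetical helper: absolute address of <+offset> inside a function."""
    return func_start + offset


# Map the backtrace offset onto the addresses shown in the gdb listing.
ret = return_address(FUNC_START_GDB, FRAME_OFFSET)
print(f"<+{FRAME_OFFSET}> -> {ret:#018x}")  # <+3233> -> 0x00007ffff7cc3ab1

# Where the same function started in the crashed process.
crash_start = FRAME_ADDR_CRASH - FRAME_OFFSET
print(f"function start in crashed process: {crash_start:#x}")

# The low 12 bits (the offset within a 4 KiB page) agree between the two
# processes, which is expected when only the library load base differs.
same_page_offset = (crash_start & 0xfff) == (FUNC_START_GDB & 0xfff)
print("same page offset:", same_page_offset)
```

The matching page offset suggests that the binary loaded in gdb has the
same layout as the one that crashed, although it does not prove the two
builds are identical.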

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/2089565

Title:
  MON and MDS crash upgrading  CEPH  on ubuntu 24.04 LTS

Status in ceph package in Ubuntu:
  Confirmed

Bug description:
  This issue is a continuation of
  https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/2065515

  
  On Ubuntu 24.04 LTS we upgraded Ceph to 19.2.0-0ubuntu0.24.04.1.

  The previous release was 19.2.0~git20240301.4c76c50-0ubuntu6.

  After every upgrade (tested on 2 different clusters), ceph-mon
  crashes repeatedly with the stack trace below:

  ```
   ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)
   1: /lib/x86_64-linux-gnu/libc.so.6(+0x45320) [0x788409245320]
   2: pthread_kill()
   3: gsignal()
   4: abort()
   5: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xa5ff5) [0x7884096a5ff5]
   6: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xbb0da) [0x7884096bb0da]
   7: (std::unexpected()+0) [0x7884096a5a55]
   8: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xbb391) [0x7884096bb391]
   9: (ceph::buffer::v15_2_0::list::iterator_impl<true>::copy(unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)+0x193) [0x78840a293593]
   10: (MDSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0xca1) [0x78840a4c3ab1]
   11: (Filesystem::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x1c3) [0x78840a4e4303]
   12: (FSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x280) [0x78840a4e6ef0]
   13: (MDSMonitor::update_from_paxos(bool*)+0x291) [0x631ac5dea801]
   14: (Monitor::refresh_from_paxos(bool*)+0x124) [0x631ac5b7a164]
   15: (Monitor::preinit()+0x98e) [0x631ac5bb2fbe]
   16: main()
   17: /lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca) [0x78840922a1ca]
   18: __libc_start_main()
   19: _start()
   NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

  ```

  
  Mitigation:
  rolling back to the previous release 19.2.0~git20240301.4c76c50-0ubuntu6 still restores service.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/2089565/+subscriptions