[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS
Maksym Medvied
2089565 at bugs.launchpad.net
Sat Dec 21 19:28:04 UTC 2024
This is the SIGABRT stack backtrace:
1: /lib/x86_64-linux-gnu/libc.so.6(+0x45320) [0x749752045320]
2: pthread_kill()
3: gsignal()
4: abort()
5: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xa5ff5) [0x7497524a5ff5]
6: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xbb0da) [0x7497524bb0da]
7: (std::unexpected()+0) [0x7497524a5a55]
8: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xbb391) [0x7497524bb391]
9: (ceph::buffer::v15_2_0::list::iterator_impl<true>::copy(unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)+0x193) [0x749753093593]
10: (MDSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0xca1) [0x7497532c3ab1]
11: (Filesystem::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x1c3) [0x7497532e4303]
12: (FSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x280) [0x7497532e6ef0]
13: (MDSMonitor::update_from_paxos(bool*)+0x291) [0x600eddf89801]
14: (Monitor::refresh_from_paxos(bool*)+0x124) [0x600eddd19164]
15: (Monitor::preinit()+0x98e) [0x600eddd51fbe]
16: main()
17: /lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca) [0x74975202a1ca]
18: __libc_start_main()
19: _start()
From what we can see here, the last Ceph-related frame is 9, and
list::iterator_impl looks like something generic. The previous frame, 10,
is in MDSMap::decode(), which is a likely place for a version
incompatibility. Let's dig deeper into that frame.
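To work with frames like these programmatically, the symbol, offset and address can be pulled out with a regular expression. This is only a sketch: the names FRAME_RE and parse_frame are made up for illustration, and the pattern only covers frames that carry a symbol and a hex offset.

```python
import re

# Matches frames such as "10: (MDSMap::decode(...)+0xca1) [0x7497532c3ab1]".
# Frames without a symbol and hex offset (e.g. "2: pthread_kill()") do not match.
FRAME_RE = re.compile(r"^\s*(\d+): \((.+)\+(0x[0-9a-f]+)\) \[(0x[0-9a-f]+)\]$")

def parse_frame(line):
    """Return (frame number, symbol, offset, address), or None if no match."""
    m = FRAME_RE.match(line)
    if m is None:
        return None
    num, symbol, offset, addr = m.groups()
    return int(num), symbol, int(offset, 16), int(addr, 16)

frame = parse_frame(
    "10: (MDSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)"
    "+0xca1) [0x7497532c3ab1]"
)
print(frame)
```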
To start, we need to figure out what our binary is. We see
0> 2024-11-25T13:53:45.524+0000 74975268ba80 -1 *** Caught signal (Aborted) **
in thread 74975268ba80 thread_name:ceph-mon
just before the stack backtrace, and by searching for "ceph-mon"
backward we see
-365> 2024-11-25T13:53:45.514+0000 74975268ba80 0 ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable), process ceph-mon, pid 1304994
so it's likely that the binary is ceph-mon and the git version is 16063ff2022298c9300e49a547a16ffda59baf13.
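If there are many logs to go through, the same information can be extracted with a small script. A minimal sketch, assuming the version line always has the shape seen above (VERSION_RE is a name made up here):

```python
import re

# The version line from the log, with timestamp and thread id stripped.
log_line = (
    "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) "
    "squid (stable), process ceph-mon, pid 1304994"
)

# Assumed shape: version, 40-character git hash, release name, stability, process, pid.
VERSION_RE = re.compile(
    r"ceph version (\S+) \(([0-9a-f]{40})\) (\w+) \((\w+)\), process (\S+), pid (\d+)"
)

m = VERSION_RE.search(log_line)
version, git_sha, release, stability, process, pid = m.groups()
print(process, version, git_sha)
```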
First, let's see if there is a separate package for the binary:
> apt search ceph-mon
Sorting... Done
Full Text Search... Done
ceph-base/noble-updates 19.2.0-0ubuntu0.24.04.1 amd64
common ceph daemon libraries and management tools
ceph-mon/noble-updates 19.2.0-0ubuntu0.24.04.1 amd64
monitor server for the ceph storage system
Let's see if we can find the binary in the package:
> apt download ceph-mon
> dpkg-deb --verbose --raw-extract ./ceph-mon_19.2.0-0ubuntu0.24.04.1_amd64.deb ./
...
./usr/bin/ceph-mon
...
We were lucky that the binary name is the package name and the binary is in that package.
Now we know the exact package version 19.2.0-0ubuntu0.24.04.1 that is currently in the archive. This is the same version that is mentioned in the bug report as the "new" Ceph version. The "old" version mentioned in the bug report is 19.2.0~git20240301.4c76c50-0ubuntu6.
Let's compare the sources for MDSMap::decode() to see if it changed between the versions - if so, it would be a good suspect.
The source for the Ubuntu Ceph packages is in
https://git.launchpad.net/ubuntu/+source/ceph:
> git clone https://git.launchpad.net/ubuntu/+source/ceph
> cd ceph
> git grep -n MDSMap::decode
src/mds/FSMap.cc:1086: * Insert INLINE; see comment in MDSMap::decode.
src/mds/MDSMap.cc:836:void MDSMap::decode(bufferlist::const_iterator& p)
So we're interested in src/mds/MDSMap.cc (if the file was not renamed
and the function was not moved).
Let's get the file at the two revisions, extract the MDSMap::decode()
function from both, and compare them to see the difference.
> git tag | grep 19.2.0-0ubuntu0.24.04.1
applied/19.2.0-0ubuntu0.24.04.1
import/19.2.0-0ubuntu0.24.04.1
> git show applied/19.2.0-0ubuntu0.24.04.1:src/mds/MDSMap.cc > /tmp/MDSMap.cc.new
The old version is 19.2.0~git20240301.4c76c50-0ubuntu6, the closest tag
(by name) in the repo is applied/19.2.0_git20240301.4c76c50-0ubuntu6:
> git show applied/19.2.0_git20240301.4c76c50-0ubuntu6:src/mds/MDSMap.cc > /tmp/MDSMap.cc.old
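Extracting one function from both files and comparing only that part can be sketched as follows. The helper names are made up for illustration, the brace matching is naive (braces inside strings or comments would confuse it), and the two sources here are toy stand-ins rather than the real MDSMap.cc:

```python
import difflib

def extract_function(source, signature):
    """Return the text of the function starting at `signature`, by brace matching."""
    start = source.index(signature)
    open_brace = source.index("{", start)
    depth = 0
    for pos in range(open_brace, len(source)):
        if source[pos] == "{":
            depth += 1
        elif source[pos] == "}":
            depth -= 1
            if depth == 0:
                return source[start:pos + 1]
    raise ValueError("unbalanced braces")

def diff_function(old_source, new_source, signature):
    """Unified diff of a single function between two versions of a file."""
    old = extract_function(old_source, signature).splitlines()
    new = extract_function(new_source, signature).splitlines()
    return list(difflib.unified_diff(old, new, "old", "new", lineterm=""))

# Toy stand-ins for the two revisions of the file.
OLD = "void f()\n{\n  if (ev >= 17) {\n    decode(a, p);\n  }\n}\nvoid g() {}\n"
NEW = "void f()\n{\n  if (ev >= 17) {\n    decode(b, p);\n  }\n}\nvoid g() {}\n"

diff = diff_function(OLD, NEW, "void f()")
print("\n".join(diff))
```

With the real files, the sources would be read from /tmp/MDSMap.cc.old and /tmp/MDSMap.cc.new, and the signature would be the one reported by git grep: "void MDSMap::decode(bufferlist::const_iterator& p)".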
After running diff on the two files we see that both the encode and the
decode functions were changed. This is the relevant part for the decode
function:
> diff -u /tmp/MDSMap.cc.old /tmp/MDSMap.cc.new
...
@@ -852,7 +863,8 @@
decode(cas_pool, p);
}
- // kclient ignores everything from here
+ // kclient skips most of what's below
+ // see fs/ceph/mdsmap.c for current decoding
__u16 ev = 1;
if (struct_v >= 2)
decode(ev, p);
@@ -949,11 +961,16 @@
}
if (ev >= 17) {
- decode(max_xattr_size, p);
+ decode(bal_rank_mask, p);
}
if (ev >= 18) {
- decode(bal_rank_mask, p);
+ decode(max_xattr_size, p);
+ }
+
+ if (ev >= 19) {
+ decode(qdb_cluster_leader, p);
+ decode(qdb_cluster_members, p);
}
/* All MDS since at least v14.0.0 understand INLINE */
We see that both the order and the number of fields changed in the
decode() function, and there doesn't seem to be any error handling for
the case when the format is incorrect.
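To see why swapping the order of two fields is enough to break decoding, here is a toy model. Everything about the wire format is an assumption made for illustration (little-endian, a 64-bit integer for max_xattr_size, a string stored as a 32-bit length followed by the bytes, and the value 65536); it is not taken from the Ceph sources. Data written in the old order is read back in the new order:

```python
import struct

class BufferEnd(Exception):
    """Hypothetical stand-in for an end-of-buffer error raised by a decoder."""

class Reader:
    def __init__(self, data):
        self.data, self.pos = data, 0

    def copy(self, n):
        # Refuse to read past the end of the buffer.
        if self.pos + n > len(self.data):
            raise BufferEnd(f"need {n} bytes, {len(self.data) - self.pos} left")
        chunk = self.data[self.pos:self.pos + n]
        self.pos += n
        return chunk

    def u64(self):
        return struct.unpack("<Q", self.copy(8))[0]

    def string(self):
        (length,) = struct.unpack("<I", self.copy(4))
        return self.copy(length).decode()

def encode_old(max_xattr_size, bal_rank_mask):
    # Old order: the integer first, then the string.
    raw = bal_rank_mask.encode()
    return struct.pack("<Q", max_xattr_size) + struct.pack("<I", len(raw)) + raw

def decode_old(data):
    r = Reader(data)
    max_xattr_size = r.u64()
    bal_rank_mask = r.string()
    return max_xattr_size, bal_rank_mask

def decode_new(data):
    # New order: the string first, then the integer.
    r = Reader(data)
    bal_rank_mask = r.string()
    max_xattr_size = r.u64()
    return max_xattr_size, bal_rank_mask

blob = encode_old(65536, "-1")
print(decode_old(blob))
try:
    decode_new(blob)
except BufferEnd as exc:
    print("new decoder failed:", exc)
```

In this model the new decoder interprets the first four bytes of the integer as a string length and then asks for far more bytes than the buffer holds.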
Now let's explore the binary to see exactly where in MDSMap::decode()
the abort happens.
We have ceph-mon binary extracted earlier. We could load it in gdb,
which should provide disassembled versions of the functions. We could
also try to load debuginfo and put the source tree at the right place to
get even better symbols and source references.
> gdb ./usr/bin/ceph-mon
...
This GDB supports auto-downloading debuginfo from the following URLs:
<https://debuginfod.ubuntu.com>
Enable debuginfod for this session? (y or [n]) y
...
(gdb) start
Downloading source file /usr/src/ceph-19.2.0-0ubuntu0.24.04.1/src/ceph_mon.cc
Temporary breakpoint 1 at 0x32c670: file /usr/src/ceph-19.2.0-0ubuntu0.24.04.1/src/ceph_mon.cc, line 250.
...
Temporary breakpoint 1, main (argc=1, argv=0x7fffffffdf98)
at /usr/src/ceph-19.2.0-0ubuntu0.24.04.1/src/ceph_mon.cc:250
warning: 250 /usr/src/ceph-19.2.0-0ubuntu0.24.04.1/src/ceph_mon.cc: No such file or directory
(gdb)
Now we know that it's looking for the source tree in
/usr/src/ceph-19.2.0-0ubuntu0.24.04.1/. Let's put the tree there. You
may first need to add "deb-src" after "deb" (so that it reads
"deb deb-src") in /etc/apt/sources.list.d/ubuntu.sources:
> cd /usr/src/
> sudo apt source ceph
Now we see that the directory with the Ceph source is ceph-19.2.0. Let's
create a symlink so that gdb can find it:
> sudo ln -sv ceph-19.2.0 ceph-19.2.0-0ubuntu0.24.04.1
'ceph-19.2.0-0ubuntu0.24.04.1' -> 'ceph-19.2.0'
Let's restart gdb with ceph-mon again:
(gdb) start
Temporary breakpoint 1 at 0x32c670: file /usr/src/ceph-19.2.0-0ubuntu0.24.04.1/src/ceph_mon.cc, line 250.
Starting program: /tmp/2/usr/bin/ceph-mon
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Temporary breakpoint 1, main (argc=1, argv=0x7fffffffdf98)
at /usr/src/ceph-19.2.0-0ubuntu0.24.04.1/src/ceph_mon.cc:250
250 {
(gdb) l
245 }
246 return addrs;
247 }
248
249 int main(int argc, const char **argv)
250 {
251 // reset our process name, in case we did a respawn, so that it's not
252 // left as "exe".
253 ceph_pthread_setname(pthread_self(), "ceph-mon");
254
Now we see the sources. The part of the backtrace that we want to know
more about is
10: (MDSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0xca1) [0x7497532c3ab1]
Let's see what's there:
(gdb) set pagination off
(gdb) disassemble 'MDSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)'
Dump of assembler code for function _ZN6MDSMap6decodeERN4ceph6buffer7v15_2_04list13iterator_implILb1EEE:
Address range 0x7ffff7cc2e10 to 0x7ffff7cc3c4d:
0x00007ffff7cc2e10 <+0>: endbr64
0x00007ffff7cc2e14 <+4>: push %rbp
0x00007ffff7cc2e15 <+5>: mov %rsp,%rbp
0x00007ffff7cc2e18 <+8>: push %r15
0x00007ffff7cc2e1a <+10>: push %r14
0x00007ffff7cc2e1c <+12>: lea -0x2f3(%rbp),%rdx
...
The offsets here are in decimal, while the offsets in the stack
backtrace are in hex. We need decimal, so
(gdb) p 0xca1
$1 = 3233
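The same arithmetic, as a quick sanity check outside gdb. The function start address comes from the "Address range" line above and is specific to this gdb session:

```python
func_start = 0x7ffff7cc2e10      # start of MDSMap::decode() in this gdb session
offset = 0xca1                   # "+0xca1" from frame 10 of the backtrace

print(offset)                    # 3233, the decimal form used in gdb's <+N> labels
print(hex(func_start + offset))  # 0x7ffff7cc3ab1, the address to look for

# The same offset applied to the address from the crash log gives the
# function start in the crashed process.
crash_addr = 0x7497532c3ab1
print(hex(crash_addr - offset))  # 0x7497532c2e10
```

The low 12 bits of both function start addresses are the same (0xe10), which is what we'd expect if the same library is simply loaded at a different base address.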
Let's find this offset in the disassembled function:
(gdb) disassemble/m 'MDSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)'
Dump of assembler code for function _ZN6MDSMap6decodeERN4ceph6buffer7v15_2_04list13iterator_implILb1EEE:
Address range 0x7ffff7cc2e10 to 0x7ffff7cc3c4d:
837 {
0x00007ffff7cc2e10 <+0>: endbr64
0x00007ffff7cc2e14 <+4>: push %rbp
0x00007ffff7cc2e15 <+5>: mov %rsp,%rbp
...
963 if (ev >= 17) {
0x00007ffff7cc3a65 <+3157>: cmp $0x10,%r12w
0x00007ffff7cc3a6a <+3162>: je 0x7ffff7cc3371 <_ZN6MDSMap6decodeERN4ceph6buffer7v15_2_04list13iterator_implILb1EEE+1377>
964 decode(bal_rank_mask, p);
0x00007ffff7cc3a70 <+3168>: lea -0x2a4(%rbp),%rdx
0x00007ffff7cc3a77 <+3175>: mov $0x4,%esi
0x00007ffff7cc3a7c <+3180>: mov %r13,%rdi
0x00007ffff7cc3a7f <+3183>: lea 0x1c0(%rbx),%r14
965 }
966
967 if (ev >= 18) {
0x00007ffff7cc3ab1 <+3233>: cmp $0x11,%r12w
0x00007ffff7cc3ab6 <+3238>: je 0x7ffff7cc3371 <_ZN6MDSMap6decodeERN4ceph6buffer7v15_2_04list13iterator_implILb1EEE+1377>
968 decode(max_xattr_size, p);
969 }
970
971 if (ev >= 19) {
0x00007ffff7cc3ade <+3278>: cmp $0x12,%r12w
...
The return address is 0x00007ffff7cc3ab1 <+3233>, so we're looking for a call instruction just before it.
The addresses in the /m output are not contiguous, because the instructions are grouped by source line, so it makes sense to look at the plain disassembly as well (i.e. disassemble without /m):
(gdb) disassemble 'MDSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)'
Dump of assembler code for function _ZN6MDSMap6decodeERN4ceph6buffer7v15_2_04list13iterator_implILb1EEE:
Address range 0x7ffff7cc2e10 to 0x7ffff7cc3c4d:
0x00007ffff7cc2e10 <+0>: endbr64
0x00007ffff7cc2e14 <+4>: push %rbp
...
0x00007ffff7cc3a65 <+3157>: cmp $0x10,%r12w
0x00007ffff7cc3a6a <+3162>: je 0x7ffff7cc3371 <_ZN6MDSMap6decodeERN4ceph6buffer7v15_2_04list13iterator_implILb1EEE+1377>
0x00007ffff7cc3a70 <+3168>: lea -0x2a4(%rbp),%rdx
0x00007ffff7cc3a77 <+3175>: mov $0x4,%esi
0x00007ffff7cc3a7c <+3180>: mov %r13,%rdi
0x00007ffff7cc3a7f <+3183>: lea 0x1c0(%rbx),%r14
0x00007ffff7cc3a86 <+3190>: call 0x7ffff7a93320 <_ZN4ceph6buffer7v15_2_04list13iterator_implILb1EE4copyEjPc>
0x00007ffff7cc3a8b <+3195>: mov 0x1c0(%rbx),%rax
0x00007ffff7cc3a92 <+3202>: mov -0x2a4(%rbp),%esi
0x00007ffff7cc3a98 <+3208>: mov %r14,%rdx
0x00007ffff7cc3a9b <+3211>: mov %r13,%rdi
0x00007ffff7cc3a9e <+3214>: movq $0x0,0x1c8(%rbx)
0x00007ffff7cc3aa9 <+3225>: movb $0x0,(%rax)
0x00007ffff7cc3aac <+3228>: call 0x7ffff7a93400 <_ZN4ceph6buffer7v15_2_04list13iterator_implILb1EE4copyEjRNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE>
0x00007ffff7cc3ab1 <+3233>: cmp $0x11,%r12w
...
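Finding the instruction that immediately precedes the return address can also be scripted. A minimal sketch, assuming the listing has the "address <+offset>: mnemonic operands" shape shown above (INSN_RE and instruction_before are made-up names, and gdb's "=>" current-instruction marker is not handled):

```python
import re

# A few lines copied from the listing above.
LISTING = """\
0x00007ffff7cc3a9e <+3214>: movq $0x0,0x1c8(%rbx)
0x00007ffff7cc3aa9 <+3225>: movb $0x0,(%rax)
0x00007ffff7cc3aac <+3228>: call 0x7ffff7a93400 <_ZN4ceph6buffer7v15_2_04list13iterator_implILb1EE4copyEjRNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE>
0x00007ffff7cc3ab1 <+3233>: cmp $0x11,%r12w
"""

INSN_RE = re.compile(r"^\s*(0x[0-9a-f]+) <\+(\d+)>:\s+(\S+)\s*(.*)$")

def instruction_before(listing, return_offset):
    """Return (offset, mnemonic, operands) of the instruction preceding return_offset."""
    previous = None
    for line in listing.splitlines():
        m = INSN_RE.match(line)
        if m is None:
            continue
        offset = int(m.group(2))
        if offset == return_offset:
            return previous
        previous = (offset, m.group(3), m.group(4))
    return None

insn = instruction_before(LISTING, 3233)
print(insn[0], insn[1])
```

The 5-byte gap between +3228 and +3233 matches the size of a near call instruction on x86-64, which is another hint that +3233 really is a return address.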
The function called just before that has the mangled name
_ZN4ceph6buffer7v15_2_04list13iterator_implILb1EE4copyEjRNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE,
which demangles to
(gdb) demangle _ZN4ceph6buffer7v15_2_04list13iterator_implILb1EE4copyEjRNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
ceph::buffer::v15_2_0::list::iterator_impl<true>::copy(unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)
and this is the function that we see in frame 9:
9: (ceph::buffer::v15_2_0::list::iterator_impl<true>::copy(unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)+0x193) [0x749753093593]
so we're on the right track. The call comes shortly after the
0x00007ffff7cc3a65 <+3157>: cmp $0x10,%r12w
0x00007ffff7cc3a6a <+3162>: je 0x7ffff7cc3371 <_ZN6MDSMap6decodeERN4ceph6buffer7v15_2_04list13iterator_implILb1EEE+1377>
branch, which appears to correspond to the if statement in
963 if (ev >= 17) {
964 decode(bal_rank_mask, p);
965 }
If the value equals 16, the jump is taken; otherwise decode(bal_rank_mask, p) is called.
bal_rank_mask is a std::string, and the called function takes a basic_string parameter, so it seems we're still on the right track:
697 std::string bal_rank_mask = "-1";
As we see in the diff above
if (ev >= 17) {
- decode(max_xattr_size, p);
+ decode(bal_rank_mask, p);
}
if (ev >= 18) {
- decode(bal_rank_mask, p);
+ decode(max_xattr_size, p);
+ }
+
these two decode() calls were swapped. Let's find out why.
To do so we need to clone the upstream repo and run git blame on the file to see when and why the lines were changed:
> git clone https://github.com/ceph/ceph ceph-upstream
> cd ceph-upstream/
> git blame src/mds/MDSMap.cc
...
e134c8907013 (Yongseok Oh 2022-10-11 20:47:32 +0900 963) if (ev >= 17) {
78abfeaff27f (Patrick Donnelly 2024-02-15 10:28:32 -0500 964) decode(bal_rank_mask, p);
36ee8e7ed365 (Venky Shankar 2023-12-01 04:32:20 -0500 965) }
36ee8e7ed365 (Venky Shankar 2023-12-01 04:32:20 -0500 966)
36ee8e7ed365 (Venky Shankar 2023-12-01 04:32:20 -0500 967) if (ev >= 18) {
78abfeaff27f (Patrick Donnelly 2024-02-15 10:28:32 -0500 968) decode(max_xattr_size, p);
e134c8907013 (Yongseok Oh 2022-10-11 20:47:32 +0900 969) }
...
We see that both decode() calls were changed in the same commit. If
we look at it with
> git show 78abfeaff27f
we'll see that this is what we were looking for. A link to the commit:
https://github.com/ceph/ceph/commit/78abfeaff27fee343fb664db633de5b221699a73.
https://bugs.launchpad.net/bugs/2089565
Title:
MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS
Status in ceph package in Ubuntu:
Confirmed
Bug description:
This issue is a continuation of
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/2065515
On Ubuntu 24.04 lts we did upgrade Ceph to 19.2.0-0ubuntu0.24.04.1
Previous release is : 19.2.0~git20240301.4c76c50-0ubuntu6
whenever upgrading (tested on 2 different clusters) the ceph-mon
ends up crashing repeatedly with the below stack error
```
ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)
1: /lib/x86_64-linux-gnu/libc.so.6(+0x45320) [0x788409245320]
2: pthread_kill()
3: gsignal()
4: abort()
5: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xa5ff5) [0x7884096a5ff5]
6: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xbb0da) [0x7884096bb0da]
7: (std::unexpected()+0) [0x7884096a5a55]
8: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xbb391) [0x7884096bb391]
9: (ceph::buffer::v15_2_0::list::iterator_impl<true>::copy(unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)+0x193) [0x78840a293593]
10: (MDSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0xca1) [0x78840a4c3ab1]
11: (Filesystem::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x1c3) [0x78840a4e4303]
12: (FSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x280) [0x78840a4e6ef0]
13: (MDSMonitor::update_from_paxos(bool*)+0x291) [0x631ac5dea801]
14: (Monitor::refresh_from_paxos(bool*)+0x124) [0x631ac5b7a164]
15: (Monitor::preinit()+0x98e) [0x631ac5bb2fbe]
16: main()
17: /lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca) [0x78840922a1ca]
18: __libc_start_main()
19: _start()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
```
mitigation:
a rollback to the previous release 19.2.0~git20240301.4c76c50-0ubuntu6 is still possible to restore service