[Bug 1890334] Re: ceph: nautilus: backport fixes for msgr/eventcenter

Thu Aug 6 16:55:29 UTC 2020

** Description changed:

+ [Impact]
+ 
+  * Ceph Nautilus/14 may hit daemon crashes in msgr/eventcenter
+    as it lacks backport fixes to properly protect many threads.
+    
+  * Once a daemon crash occurs, the cluster becomes HEALTH_WARN,
+    and reports in status: "N daemons have recently crashed"
+ 
+ [Fix]
+ 
+  * The backport patches in Ceph PR #33820 [1] fix this problem.
+  
+  * There are 8 patches in it, but only 5 are strictly required
+    (3 are related to testcases/sanitizers, not used in package),
+    and 1 is already applied; so actually only 4 patches needed
+    (the 'msg/async:' patches.)
+ 
+   [1] https://github.com/ceph/ceph/pull/33820
+ 
+ [Test Case]
+ 
+  * The test-case patch in the PR is a reliable reproducer; it
+    can be applied then built with -DWITH_TESTS=ON in d/rules;
+    found in 'obj-x86_64-linux-gnu/bin/ceph_test_rados_api_misc'
+ 
+  * On a test ceph cluster (e.g., 1 MON, 3 OSDs) in the mon node:
+  
+    $ sudo LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/ceph/ \
+     ./ceph_test_rados_api_misc --gtest_filter=LibRadosMisc.ShutdownRace
+ 
+  * This hits segfaults with the stack traces seen by the reporter,
+    and other traces as well in the original package, and no errors
+    in the patched package.
+    
+  * Attached the test-case binary 'ceph_test_rados_api_misc' and 
+    the juju bundle for the test ceph cluster 'ceph-lp1890334.yaml'.
+ 
+ [Regression Potential]
+ 
+  * These patches change the connection close/reset/reuse logic,
+    so regressions would likely manifest in such functions but
+    be exposed/hit errors actually in daemon communication.
+ 
+  * There are no further related fixes upstream.
+ 
+ [Other Info]
+ 
+  * Patches already available on Ceph Octopus/15 on Focal.
+  * Not reporting against Eoan (Train) as it is EOL.
+     
+ [Original Description]
+ 
  Ceph Nautilus in bionic-train may hit daemon crashes (e.g., ceph-mgr)
  in msgr/eventcenter as it lacks the following set of fixes backports:

-   https://github.com/ceph/ceph/pull/33820
+   https://github.com/ceph/ceph/pull/33820

  Reporting the bug against UCA since Ubuntu Eoan (Train) is EOL.
  Working on the debdiffs and tests.

  Example stack trace as reported by 'ceph crash info' and GDB:

  $ sudo ceph crash info <crash ID>
  ...
-     "process_name": "ceph-mgr",
+     "process_name": "ceph-mgr",
  ...
-     "backtrace": [
-         "(()+0x128a0) [0x7f8e4ae928a0]",
-         "(bool ProtocolV2::append_frame<ceph::msgr::v2::MessageFrame>(ceph::msgr::v2::MessageFrame&)+0x48a) [0x7f8e4bf4219a]",
-         "(ProtocolV2::write_message(Message*, bool)+0x4dd) [0x7f8e4bf249dd]",
-         "(ProtocolV2::write_event()+0x2c5) [0x7f8e4bf39d55]",
-         "(AsyncConnection::handle_write()+0x43) [0x7f8e4bef89e3]",
-         "(EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0xd57) [0x7f8e4bf51157]",
-         "(()+0x59b848) [0x7f8e4bf55848]",
-         "(()+0xbd6df) [0x7f8e4a9b06df]",
-         "(()+0x76db) [0x7f8e4ae876db]",
-         "(clone()+0x3f) [0x7f8e4a06da3f]"
-     ]
+     "backtrace": [
+         "(()+0x128a0) [0x7f8e4ae928a0]",
+         "(bool ProtocolV2::append_frame<ceph::msgr::v2::MessageFrame>(ceph::msgr::v2::MessageFrame&)+0x48a) [0x7f8e4bf4219a]",
+         "(ProtocolV2::write_message(Message*, bool)+0x4dd) [0x7f8e4bf249dd]",
+         "(ProtocolV2::write_event()+0x2c5) [0x7f8e4bf39d55]",
+         "(AsyncConnection::handle_write()+0x43) [0x7f8e4bef89e3]",
+         "(EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0xd57) [0x7f8e4bf51157]",
+         "(()+0x59b848) [0x7f8e4bf55848]",
+         "(()+0xbd6df) [0x7f8e4a9b06df]",
+         "(()+0x76db) [0x7f8e4ae876db]",
+         "(clone()+0x3f) [0x7f8e4a06da3f]"
+     ]
  ...

  (gdb) bt
  #0  raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:51
  #1  0x000055b9deda9140 in reraise_fatal (signum=11) at ./src/global/signal_handler.cc:81
  #2  handle_fatal_signal (signum=11) at ./src/global/signal_handler.cc:326
  #3  <signal handler called>
  #4  ceph::msgr::v2::Frame<ceph::msgr::v2::MessageFrame, (unsigned short)8, (unsigned short)8, (unsigned short)8, (unsigned short)4096>::get_buffer (session_stream_handlers=..., this=<optimized out>) at ./src/msg/async/frames_v2.h:273
  #5  ProtocolV2::append_frame<ceph::msgr::v2::MessageFrame> (this=this@entry=0x55b9e4830680, frame=...) at ./src/msg/async/ProtocolV2.cc:552
  #6  0x00007f8e4bf249dd in ProtocolV2::write_message (this=this@entry=0x55b9e4830680, m=m@entry=0x55b9e596da40, more=more@entry=false)
-     at ./src/msg/async/ProtocolV2.cc:515
+     at ./src/msg/async/ProtocolV2.cc:515
  #7  0x00007f8e4bf39d55 in ProtocolV2::write_event (this=0x55b9e4830680) at ./src/msg/async/ProtocolV2.cc:627
  #8  0x00007f8e4bef89e3 in AsyncConnection::handle_write (this=0x55b9e73ec480) at ./src/msg/async/AsyncConnection.cc:692
- #9  0x00007f8e4bf51157 in EventCenter::process_events (this=this@entry=0x55b9e05502c0, timeout_microseconds=<optimized out>, 
-     timeout_microseconds@entry=30000000, working_dur=working_dur@entry=0x7f8e466d5828) at ./src/msg/async/Event.cc:441
+ #9  0x00007f8e4bf51157 in EventCenter::process_events (this=this@entry=0x55b9e05502c0, timeout_microseconds=<optimized out>,
+     timeout_microseconds@entry=30000000, working_dur=working_dur@entry=0x7f8e466d5828) at ./src/msg/async/Event.cc:441
  #10 0x00007f8e4bf55848 in NetworkStack::<lambda()>::operator() (__closure=0x55b9e05feff8) at ./src/msg/async/Stack.cc:53
  #11 std::_Function_handler<void(), NetworkStack::add_thread(unsigned int)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...)
-     at /usr/include/c++/7/bits/std_function.h:316
+     at /usr/include/c++/7/bits/std_function.h:316
  #12 0x00007f8e4a9b06df in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
  #13 0x00007f8e4ae876db in start_thread (arg=0x7f8e466d8700) at pthread_create.c:463
  #14 0x00007f8e4a06da3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

** Description changed:

  [Impact]

-  * Ceph Nautilus/14 may hit daemon crashes in msgr/eventcenter
-    as it lacks backport fixes to properly protect many threads.
-    
-  * Once a daemon crash occurs, the cluster becomes HEALTH_WARN,
-    and reports in status: "N daemons have recently crashed"
+  * Ceph Nautilus/14 may hit daemon crashes in msgr/eventcenter
+    as it lacks backport fixes to properly protect many threads
+    in the connection close/reset/reuse paths.
+ 
+  * Once a daemon crash occurs, the cluster becomes HEALTH_WARN,
+    and reports in status: "N daemons have recently crashed"

  [Fix]

-  * The backport patches in Ceph PR #33820 [1] fix this problem.
-  
-  * There are 8 patches in it, but only 5 are strictly required
-    (3 are related to testcases/sanitizers, not used in package),
-    and 1 is already applied; so actually only 4 patches needed
-    (the 'msg/async:' patches.)
+  * The backport patches in Ceph PR #33820 [1] fix this problem.

-   [1] https://github.com/ceph/ceph/pull/33820
+  * There are 8 patches in it, but only 5 are strictly required
+    (3 are related to testcases/sanitizers, not used in package),
+    and 1 is already applied; so actually only 4 patches needed
+    (the 'msg/async:' patches.)
+ 
+   [1] https://github.com/ceph/ceph/pull/33820

  [Test Case]

-  * The test-case patch in the PR is a reliable reproducer; it
-    can be applied then built with -DWITH_TESTS=ON in d/rules;
-    found in 'obj-x86_64-linux-gnu/bin/ceph_test_rados_api_misc'
+  * The test-case patch in the PR is a reliable reproducer; it
+    can be applied then built with -DWITH_TESTS=ON in d/rules;
+    found in 'obj-x86_64-linux-gnu/bin/ceph_test_rados_api_misc'

-  * On a test ceph cluster (e.g., 1 MON, 3 OSDs) in the mon node:
-  
-    $ sudo LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/ceph/ \
-     ./ceph_test_rados_api_misc --gtest_filter=LibRadosMisc.ShutdownRace
+  * On a test ceph cluster (e.g., 1 MON, 3 OSDs) in the mon node:

-  * This hits segfaults with the stack traces seen by the reporter,
-    and other traces as well in the original package, and no errors
-    in the patched package.
-    
-  * Attached the test-case binary 'ceph_test_rados_api_misc' and 
-    the juju bundle for the test ceph cluster 'ceph-lp1890334.yaml'.
+    $ sudo LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/ceph/ \
+     ./ceph_test_rados_api_misc --gtest_filter=LibRadosMisc.ShutdownRace
+ 
+  * This hits segfaults with the stack traces seen by the reporter,
+    and other traces as well in the original package, and no errors
+    in the patched package.
+ 
+  * Attached the test-case binary 'ceph_test_rados_api_misc' and
+    the juju bundle for the test ceph cluster 'ceph-lp1890334.yaml'.

  [Regression Potential]

-  * These patches change the connection close/reset/reuse logic,
-    so regressions would likely manifest in such functions but
-    be exposed/hit errors actually in daemon communication.
+  * These patches change the connection close/reset/reuse logic,
+    so regressions would likely manifest in such functions but
+    be exposed/hit errors actually in daemon communication.

-  * There are no further related fixes upstream.
+  * There are no further related fixes upstream.

  [Other Info]

-  * Patches already available on Ceph Octopus/15 on Focal.
-  * Not reporting against Eoan (Train) as it is EOL.
-     
+  * Patches already available on Ceph Octopus/15 on Focal.
+  * Not reporting against Eoan (Train) as it is EOL.
+ 
  [Original Description]

  Ceph Nautilus in bionic-train may hit daemon crashes (e.g., ceph-mgr)
  in msgr/eventcenter as it lacks the following set of fixes backports:

    https://github.com/ceph/ceph/pull/33820

  Reporting the bug against UCA since Ubuntu Eoan (Train) is EOL.
  Working on the debdiffs and tests.

  Example stack trace as reported by 'ceph crash info' and GDB:

  $ sudo ceph crash info <crash ID>
  ...
      "process_name": "ceph-mgr",
  ...
      "backtrace": [
          "(()+0x128a0) [0x7f8e4ae928a0]",
          "(bool ProtocolV2::append_frame<ceph::msgr::v2::MessageFrame>(ceph::msgr::v2::MessageFrame&)+0x48a) [0x7f8e4bf4219a]",
          "(ProtocolV2::write_message(Message*, bool)+0x4dd) [0x7f8e4bf249dd]",
          "(ProtocolV2::write_event()+0x2c5) [0x7f8e4bf39d55]",
          "(AsyncConnection::handle_write()+0x43) [0x7f8e4bef89e3]",
          "(EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0xd57) [0x7f8e4bf51157]",
          "(()+0x59b848) [0x7f8e4bf55848]",
          "(()+0xbd6df) [0x7f8e4a9b06df]",
          "(()+0x76db) [0x7f8e4ae876db]",
          "(clone()+0x3f) [0x7f8e4a06da3f]"
      ]
  ...

  (gdb) bt
  #0  raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:51
  #1  0x000055b9deda9140 in reraise_fatal (signum=11) at ./src/global/signal_handler.cc:81
  #2  handle_fatal_signal (signum=11) at ./src/global/signal_handler.cc:326
  #3  <signal handler called>
  #4  ceph::msgr::v2::Frame<ceph::msgr::v2::MessageFrame, (unsigned short)8, (unsigned short)8, (unsigned short)8, (unsigned short)4096>::get_buffer (session_stream_handlers=..., this=<optimized out>) at ./src/msg/async/frames_v2.h:273
  #5  ProtocolV2::append_frame<ceph::msgr::v2::MessageFrame> (this=this@entry=0x55b9e4830680, frame=...) at ./src/msg/async/ProtocolV2.cc:552
  #6  0x00007f8e4bf249dd in ProtocolV2::write_message (this=this@entry=0x55b9e4830680, m=m@entry=0x55b9e596da40, more=more@entry=false)
      at ./src/msg/async/ProtocolV2.cc:515
  #7  0x00007f8e4bf39d55 in ProtocolV2::write_event (this=0x55b9e4830680) at ./src/msg/async/ProtocolV2.cc:627
  #8  0x00007f8e4bef89e3 in AsyncConnection::handle_write (this=0x55b9e73ec480) at ./src/msg/async/AsyncConnection.cc:692
  #9  0x00007f8e4bf51157 in EventCenter::process_events (this=this@entry=0x55b9e05502c0, timeout_microseconds=<optimized out>,
      timeout_microseconds@entry=30000000, working_dur=working_dur@entry=0x7f8e466d5828) at ./src/msg/async/Event.cc:441
  #10 0x00007f8e4bf55848 in NetworkStack::<lambda()>::operator() (__closure=0x55b9e05feff8) at ./src/msg/async/Stack.cc:53
  #11 std::_Function_handler<void(), NetworkStack::add_thread(unsigned int)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...)
      at /usr/include/c++/7/bits/std_function.h:316
  #12 0x00007f8e4a9b06df in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
  #13 0x00007f8e4ae876db in start_thread (arg=0x7f8e466d8700) at pthread_create.c:463
  #14 0x00007f8e4a06da3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

** Description changed:

  [Impact]

   * Ceph Nautilus/14 may hit daemon crashes in msgr/eventcenter
     as it lacks backport fixes to properly protect many threads
-    in the connection close/reset/reuse paths.
+    in the connection close/reset/reuse paths.

   * Once a daemon crash occurs, the cluster becomes HEALTH_WARN,
     and reports in status: "N daemons have recently crashed"
+ 
+  * Example:
+ 	
+     $ juju run --unit ceph-mon/0 "sudo ceph -s"
+     cluster:
+     id: ...
+     health: HEALTH_WARN
+     1 daemons have recently crashed

  [Fix]

   * The backport patches in Ceph PR #33820 [1] fix this problem.

   * There are 8 patches in it, but only 5 are strictly required
     (3 are related to testcases/sanitizers, not used in package),
     and 1 is already applied; so actually only 4 patches needed
     (the 'msg/async:' patches.)

    [1] https://github.com/ceph/ceph/pull/33820

  [Test Case]

   * The test-case patch in the PR is a reliable reproducer; it
     can be applied then built with -DWITH_TESTS=ON in d/rules;
     found in 'obj-x86_64-linux-gnu/bin/ceph_test_rados_api_misc'

   * On a test ceph cluster (e.g., 1 MON, 3 OSDs) in the mon node:

     $ sudo LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/ceph/ \
      ./ceph_test_rados_api_misc --gtest_filter=LibRadosMisc.ShutdownRace

   * This hits segfaults with the stack traces seen by the reporter,
     and other traces as well in the original package, and no errors
     in the patched package.

   * Attached the test-case binary 'ceph_test_rados_api_misc' and
     the juju bundle for the test ceph cluster 'ceph-lp1890334.yaml'.
- 
+  
  [Regression Potential]

   * These patches change the connection close/reset/reuse logic,
     so regressions would likely manifest in such functions but
     be exposed/hit errors actually in daemon communication.

   * There are no further related fixes upstream.

  [Other Info]

   * Patches already available on Ceph Octopus/15 on Focal.
   * Not reporting against Eoan (Train) as it is EOL.

  [Original Description]

  Ceph Nautilus in bionic-train may hit daemon crashes (e.g., ceph-mgr)
  in msgr/eventcenter as it lacks the following set of fixes backports:

    https://github.com/ceph/ceph/pull/33820

  Reporting the bug against UCA since Ubuntu Eoan (Train) is EOL.
  Working on the debdiffs and tests.

  Example stack trace as reported by 'ceph crash info' and GDB:

  $ sudo ceph crash info <crash ID>
  ...
      "process_name": "ceph-mgr",
  ...
      "backtrace": [
          "(()+0x128a0) [0x7f8e4ae928a0]",
          "(bool ProtocolV2::append_frame<ceph::msgr::v2::MessageFrame>(ceph::msgr::v2::MessageFrame&)+0x48a) [0x7f8e4bf4219a]",
          "(ProtocolV2::write_message(Message*, bool)+0x4dd) [0x7f8e4bf249dd]",
          "(ProtocolV2::write_event()+0x2c5) [0x7f8e4bf39d55]",
          "(AsyncConnection::handle_write()+0x43) [0x7f8e4bef89e3]",
          "(EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0xd57) [0x7f8e4bf51157]",
          "(()+0x59b848) [0x7f8e4bf55848]",
          "(()+0xbd6df) [0x7f8e4a9b06df]",
          "(()+0x76db) [0x7f8e4ae876db]",
          "(clone()+0x3f) [0x7f8e4a06da3f]"
      ]
  ...

  (gdb) bt
  #0  raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:51
  #1  0x000055b9deda9140 in reraise_fatal (signum=11) at ./src/global/signal_handler.cc:81
  #2  handle_fatal_signal (signum=11) at ./src/global/signal_handler.cc:326
  #3  <signal handler called>
  #4  ceph::msgr::v2::Frame<ceph::msgr::v2::MessageFrame, (unsigned short)8, (unsigned short)8, (unsigned short)8, (unsigned short)4096>::get_buffer (session_stream_handlers=..., this=<optimized out>) at ./src/msg/async/frames_v2.h:273
  #5  ProtocolV2::append_frame<ceph::msgr::v2::MessageFrame> (this=this@entry=0x55b9e4830680, frame=...) at ./src/msg/async/ProtocolV2.cc:552
  #6  0x00007f8e4bf249dd in ProtocolV2::write_message (this=this@entry=0x55b9e4830680, m=m@entry=0x55b9e596da40, more=more@entry=false)
      at ./src/msg/async/ProtocolV2.cc:515
  #7  0x00007f8e4bf39d55 in ProtocolV2::write_event (this=0x55b9e4830680) at ./src/msg/async/ProtocolV2.cc:627
  #8  0x00007f8e4bef89e3 in AsyncConnection::handle_write (this=0x55b9e73ec480) at ./src/msg/async/AsyncConnection.cc:692
  #9  0x00007f8e4bf51157 in EventCenter::process_events (this=this@entry=0x55b9e05502c0, timeout_microseconds=<optimized out>,
      timeout_microseconds@entry=30000000, working_dur=working_dur@entry=0x7f8e466d5828) at ./src/msg/async/Event.cc:441
  #10 0x00007f8e4bf55848 in NetworkStack::<lambda()>::operator() (__closure=0x55b9e05feff8) at ./src/msg/async/Stack.cc:53
  #11 std::_Function_handler<void(), NetworkStack::add_thread(unsigned int)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...)
      at /usr/include/c++/7/bits/std_function.h:316
  #12 0x00007f8e4a9b06df in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
  #13 0x00007f8e4ae876db in start_thread (arg=0x7f8e466d8700) at pthread_create.c:463
  #14 0x00007f8e4a06da3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1890334

Title:
  ceph: nautilus: backport fixes for msgr/eventcenter

Status in Ubuntu Cloud Archive:
  In Progress

Bug description:
  [Impact]

   * Ceph Nautilus/14 may hit daemon crashes in msgr/eventcenter
     because it lacks the backported fixes that properly guard the
     connection close/reset/reuse paths when many threads are involved.

   * Once a daemon crash occurs, the cluster becomes HEALTH_WARN,
     and reports in status: "N daemons have recently crashed"

   * Example:

      $ juju run --unit ceph-mon/0 "sudo ceph -s"
      cluster:
      id: ...
      health: HEALTH_WARN
      1 daemons have recently crashed

  [Fix]

   * The backport patches in Ceph PR #33820 [1] fix this problem.

   * The PR contains 8 patches, but only 5 are strictly required
     (the other 3 relate to testcases/sanitizers, which are not used
     in the package), and 1 of those 5 is already applied; so only
     4 patches are actually needed (the 'msg/async:' patches).

    [1] https://github.com/ceph/ceph/pull/33820

  [Test Case]

   * The test-case patch in the PR is a reliable reproducer; it
     can be applied and built with -DWITH_TESTS=ON in d/rules; the
     binary is found in 'obj-x86_64-linux-gnu/bin/ceph_test_rados_api_misc'

   * On a test Ceph cluster (e.g., 1 MON, 3 OSDs), run on the mon node:

     $ sudo LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/ceph/ \
      ./ceph_test_rados_api_misc --gtest_filter=LibRadosMisc.ShutdownRace

   * With the original package, this hits segfaults with the stack
     traces seen by the reporter (and other traces as well); with
     the patched package, it hits no errors.

   * Attached the test-case binary 'ceph_test_rados_api_misc' and
     the juju bundle for the test ceph cluster 'ceph-lp1890334.yaml'.

  [Regression Potential]

   * These patches change the connection close/reset/reuse logic,
     so regressions would likely originate in those functions but
     surface as errors in daemon communication.

   * There are no further related fixes upstream.

  [Other Info]

   * Patches already available on Ceph Octopus/15 on Focal.
   * Not reporting against Eoan (Train) as it is EOL.

  [Original Description]

  Ceph Nautilus in bionic-train may hit daemon crashes (e.g., ceph-mgr)
  in msgr/eventcenter as it lacks the following set of backported fixes:

    https://github.com/ceph/ceph/pull/33820

  Reporting the bug against UCA since Ubuntu Eoan (Train) is EOL.
  Working on the debdiffs and tests.

  Example stack trace as reported by 'ceph crash info' and GDB:

  $ sudo ceph crash info <crash ID>
  ...
      "process_name": "ceph-mgr",
  ...
      "backtrace": [
          "(()+0x128a0) [0x7f8e4ae928a0]",
          "(bool ProtocolV2::append_frame<ceph::msgr::v2::MessageFrame>(ceph::msgr::v2::MessageFrame&)+0x48a) [0x7f8e4bf4219a]",
          "(ProtocolV2::write_message(Message*, bool)+0x4dd) [0x7f8e4bf249dd]",
          "(ProtocolV2::write_event()+0x2c5) [0x7f8e4bf39d55]",
          "(AsyncConnection::handle_write()+0x43) [0x7f8e4bef89e3]",
          "(EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0xd57) [0x7f8e4bf51157]",
          "(()+0x59b848) [0x7f8e4bf55848]",
          "(()+0xbd6df) [0x7f8e4a9b06df]",
          "(()+0x76db) [0x7f8e4ae876db]",
          "(clone()+0x3f) [0x7f8e4a06da3f]"
      ]
  ...

  (gdb) bt
  #0  raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:51
  #1  0x000055b9deda9140 in reraise_fatal (signum=11) at ./src/global/signal_handler.cc:81
  #2  handle_fatal_signal (signum=11) at ./src/global/signal_handler.cc:326
  #3  <signal handler called>
  #4  ceph::msgr::v2::Frame<ceph::msgr::v2::MessageFrame, (unsigned short)8, (unsigned short)8, (unsigned short)8, (unsigned short)4096>::get_buffer (session_stream_handlers=..., this=<optimized out>) at ./src/msg/async/frames_v2.h:273
  #5  ProtocolV2::append_frame<ceph::msgr::v2::MessageFrame> (this=this@entry=0x55b9e4830680, frame=...) at ./src/msg/async/ProtocolV2.cc:552
  #6  0x00007f8e4bf249dd in ProtocolV2::write_message (this=this@entry=0x55b9e4830680, m=m@entry=0x55b9e596da40, more=more@entry=false)
      at ./src/msg/async/ProtocolV2.cc:515
  #7  0x00007f8e4bf39d55 in ProtocolV2::write_event (this=0x55b9e4830680) at ./src/msg/async/ProtocolV2.cc:627
  #8  0x00007f8e4bef89e3 in AsyncConnection::handle_write (this=0x55b9e73ec480) at ./src/msg/async/AsyncConnection.cc:692
  #9  0x00007f8e4bf51157 in EventCenter::process_events (this=this@entry=0x55b9e05502c0, timeout_microseconds=<optimized out>,
      timeout_microseconds@entry=30000000, working_dur=working_dur@entry=0x7f8e466d5828) at ./src/msg/async/Event.cc:441
  #10 0x00007f8e4bf55848 in NetworkStack::<lambda()>::operator() (__closure=0x55b9e05feff8) at ./src/msg/async/Stack.cc:53
  #11 std::_Function_handler<void(), NetworkStack::add_thread(unsigned int)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...)
      at /usr/include/c++/7/bits/std_function.h:316
  #12 0x00007f8e4a9b06df in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
  #13 0x00007f8e4ae876db in start_thread (arg=0x7f8e466d8700) at pthread_create.c:463
  #14 0x00007f8e4a06da3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
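
  When several crashes have been recorded, it may help to screen the
  'ceph crash info' output for frames like the ones above. The following
  is a minimal sketch, not part of the fix: the function name
  is_msgr_crash and the marker list are hypothetical and chosen only from
  the symbols in this report, and it assumes the crash info is available
  as JSON with a "backtrace" list, as in the excerpt above.

```python
import json

# Substrings taken from the backtrace in this report. Treating them as the
# signature of an msgr/eventcenter crash is an assumption, not an official list.
MSGR_MARKERS = (
    "ProtocolV2::",
    "AsyncConnection::",
    "EventCenter::process_events",
)


def is_msgr_crash(crash_info_json: str) -> bool:
    """Return True if any backtrace frame mentions one of the markers."""
    info = json.loads(crash_info_json)
    frames = info.get("backtrace", [])
    return any(marker in frame for frame in frames for marker in MSGR_MARKERS)


# Shortened sample built from the frames quoted in this report.
sample = json.dumps({
    "process_name": "ceph-mgr",
    "backtrace": [
        "(()+0x128a0) [0x7f8e4ae928a0]",
        "(ProtocolV2::write_event()+0x2c5) [0x7f8e4bf39d55]",
        "(AsyncConnection::handle_write()+0x43) [0x7f8e4bef89e3]",
        "(clone()+0x3f) [0x7f8e4a06da3f]",
    ],
})

print(is_msgr_crash(sample))
```

  A match only suggests that the crash went through the msg/async code;
  it does not by itself confirm that this bug is the cause.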
