[Bug 1634989] Re: Segfault on rabbitmq-server start

Brian Murray brian at ubuntu.com
Thu Mar 30 23:35:01 UTC 2017


Hello bugproxy, or anyone else affected,

Accepted rabbitmq-server into yakkety-proposed. The package will build
now and be available at
https://launchpad.net/ubuntu/+source/rabbitmq-server/3.5.7-1ubuntu0.16.10.1
in a few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed. Your feedback will aid us in getting this
update out to other Ubuntu users.
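
As a rough sketch of what enabling -proposed involves, the helper below
prints an apt source line for a release's -proposed pocket. The function
name, the target file name and the default mirror URL are assumptions made
for illustration; ppc64el machines normally use the ports mirror instead.
The wiki page above remains the authoritative procedure.

```shell
# Hypothetical helper: print the apt source line for a release's
# -proposed pocket. The default mirror is an assumption; pass
# http://ports.ubuntu.com/ubuntu-ports/ as $2 on ppc64el.
proposed_source_line() {
    release="$1"
    mirror="${2:-http://archive.ubuntu.com/ubuntu/}"
    printf 'deb %s %s-proposed main restricted universe multiverse\n' \
        "$mirror" "$release"
}

# Example (run as root on a test machine; not executed here):
#   proposed_source_line yakkety > /etc/apt/sources.list.d/proposed.list
#   apt-get update
#   apt-get install rabbitmq-server/yakkety-proposed
```

Installing with the package/suite form pulls only rabbitmq-server from
-proposed rather than upgrading everything from that pocket.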

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, and change the tag
from verification-needed to verification-done. If it does not fix the
bug for you, please add a comment stating that, and change the tag to
verification-failed.  In either case, details of your testing will help
us make a better decision.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance!

** Changed in: rabbitmq-server (Ubuntu Yakkety)
       Status: In Progress => Fix Committed

** Tags added: verification-needed

** Changed in: rabbitmq-server (Ubuntu Xenial)
       Status: In Progress => Fix Committed

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to rabbitmq-server in Ubuntu.
https://bugs.launchpad.net/bugs/1634989

Title:
  Segfault on rabbitmq-server start

Status in rabbitmq-server package in Ubuntu:
  Fix Released
Status in rabbitmq-server source package in Xenial:
  Fix Committed
Status in rabbitmq-server source package in Yakkety:
  Fix Committed

Bug description:
  [Impact]

   * rabbitmq-server can segfault along a codepath which happens to "open
  a port with the same fd multiple times".  Doing so is undefined (and
  unsafe in erlang, though segfaulting is unintentional).

   * This only happens on specific versions of erlang, but the
  rabbitmq-server code is acknowledged to be incorrect per erlang and
  has been fixed upstream.

  * This only affects xenial & yakkety.

  * The codepath belongs to an internal helper for writing to stderr,
  so the segfault prevents useful diagnostic information from being
  provided to a user.

  [Test Case]

   * Make sure your hostname resolves to something unreachable (I've
  selected 192.168.122.2), install rabbitmq-server, witness segfault.

   * # hostname blah
   * # echo "192.168.122.2 blah" >> /etc/hosts

   * # ping blah
  PING blah (192.168.122.2) 56(84) bytes of data.
  From x1 (192.168.122.90) icmp_seq=1 Destination Host Unreachable

   * # apt install rabbitmq-server
    ...
  Mar 22 22:12:41 blah systemd[1]: Starting RabbitMQ Messaging Server...
  Mar 22 22:12:42 blah rabbitmq[17995]: Waiting for rabbit at blah ...
  Mar 22 22:12:42 blah rabbitmq[17995]: pid is 18025 ...
  Mar 22 22:12:45 blah systemd[1]: rabbitmq-server.service: Main process exited, code=exited, status=1/FAILURE
  Mar 22 22:12:46 blah rabbitmq[17995]: Segmentation fault (core dumped)

    ...

   * Expected behavior is not to segfault, and instead to print a
  diagnostic message to stderr:

   * # dpkg -i rabbitmq-server_3.5.7-1ubuntu16.04.1_all.deb
    ...

  Mar 22 22:15:16 blah systemd[1]: Starting RabbitMQ Messaging Server...
  Mar 22 22:15:16 blah rabbitmq[18365]: Waiting for rabbit at blah ...
  Mar 22 22:15:16 blah rabbitmq[18365]: pid is 18386 ...
  Mar 22 22:15:19 blah systemd[1]: rabbitmq-server.service: Main process exited, code=exited, status=1/FAILURE
  Mar 22 22:15:20 blah rabbitmq[18365]: Error: process_not_running

    ...

  
   * Note: This is just one of the error paths that hit the format_stderr() helper function.
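
When verifying, the two outcomes in the test case above can be told apart
from the service log. The following is a minimal sketch, assuming the log
text contains one of the two messages quoted above; the function name and
the labels it prints are hypothetical.

```shell
# Hypothetical helper: read service log text on stdin and report which
# of the two outcomes from the test case it shows.
classify_rabbitmq_log() {
    log="$(cat)"
    case "$log" in
        *"Segmentation fault"*)         echo "segfault" ;;
        *"Error: process_not_running"*) echo "diagnostic" ;;
        *)                              echo "unknown" ;;
    esac
}

# Example (not executed here):
#   journalctl -u rabbitmq-server --no-pager | classify_rabbitmq_log
```

A "segfault" result corresponds to the unfixed behaviour, while
"diagnostic" matches the log shown for the fixed package.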

  [Regression Potential]

   * Limited to the diagnostic messages path, so it's really only seen
  when something is configured incorrectly.  That being said, any
  execution through this path today will segfault without any diagnostic
  information to figure out what went wrong, so the fix seems infinitely
  better.

   * This fix from upstream has been in place over a year without any
  issue. The code it changes was originally a workaround for a
  buggy/flaky erlang library that has (according to upstream reports)
  been fixed since erlang 17, and is thus unneeded.

  
  [Other Info]
   
   * While the rabbitmq-server in trusty has this offending code, the version of erlang does not segfault.  Additionally, the fix provided by upstream is not necessarily sufficient on erlang < 17 that is in trusty, so I have not fixed it there. 

  * Zesty is already fixed.


  ---Problem Description---
  Starting rabbitmq-server triggers segfault.
  The segfault happens when, for instance, the host is not reachable, which breaks the installation of the rabbitmq-server package.
  It is understandable that an error occurs, but a segfault should not be the default behaviour.
  This has been tested on 16.04 and 16.10, archs ppc64el and x86_64

  ---uname output---
  Linux vm1 4.8.0-22-generic #24-Ubuntu SMP Sat Oct 8 09:14:41 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux

  ---Steps to Reproduce---
   #Better reproducible on a machine with 1 cpu

  root@yakkety:~# echo "192.168.1.1 blah" >> /etc/hosts
  root@yakkety:~# hostname blah
  root@yakkety:~# apt-get install rabbitmq-server
  Reading package lists... Done
  Building dependency tree
  Reading state information... Done
  The following NEW packages will be installed:
    rabbitmq-server
  0 upgraded, 1 newly installed, 0 to remove and 2 not upgraded.
  Need to get 0 B/4,251 kB of archives.
  After this operation, 5,243 kB of additional disk space will be used.
  Selecting previously unselected package rabbitmq-server.
  (Reading database ... 63962 files and directories currently installed.)
  Preparing to unpack .../rabbitmq-server_3.5.7-1_all.deb ...
  Unpacking rabbitmq-server (3.5.7-1) ...
  Processing triggers for ureadahead (0.100.0-19) ...
  Setting up rabbitmq-server (3.5.7-1) ...
  Created symlink /etc/systemd/system/multi-user.target.wants/rabbitmq-server.service → /lib/systemd/system/rabbitmq-server.service.
  Job for rabbitmq-server.service failed because the control process exited with error code.
  See "systemctl status rabbitmq-server.service" and "journalctl -xe" for details.
  invoke-rc.d: initscript rabbitmq-server, action "start" failed.
  ● rabbitmq-server.service - RabbitMQ Messaging Server
     Loaded: loaded (/lib/systemd/system/rabbitmq-server.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Wed 2016-10-19 11:13:46 EDT; 7ms ago
    Process: 2818 ExecStartPost=/usr/lib/rabbitmq/bin/rabbitmq-server-wait (code=exited, status=139)
    Process: 2817 ExecStart=/usr/sbin/rabbitmq-server (code=exited, status=1/FAILURE)
   Main PID: 2817 (code=exited, status=1/FAILURE)

  Oct 19 11:13:13 blah systemd[1]: Starting RabbitMQ Messaging Server...
  Oct 19 11:13:13 blah rabbitmq[2818]: Waiting for rabbit at blah ...
  Oct 19 11:13:13 blah rabbitmq[2818]: pid is 2826 ...
  Oct 19 11:13:43 blah systemd[1]: rabbitmq-server.service: Main process exited, code=exited, status=1/FAILURE
  Oct 19 11:13:46 blah rabbitmq[2818]: Segmentation fault
  Oct 19 11:13:46 blah systemd[1]: rabbitmq-server.service: Control process exited, code=exited status=139
  Oct 19 11:13:46 blah systemd[1]: Failed to start RabbitMQ Messaging Server.
  Oct 19 11:13:46 blah systemd[1]: rabbitmq-server.service: Unit entered failed state.
  Oct 19 11:13:46 blah systemd[1]: rabbitmq-server.service: Failed with result 'exit-code'.
  dpkg: error processing package rabbitmq-server (--configure):
   subprocess installed post-installation script returned error exit status 1
  Processing triggers for systemd (231-9git1) ...
  Processing triggers for man-db (2.7.5-1) ...
  Processing triggers for ureadahead (0.100.0-19) ...
  Errors were encountered while processing:
   rabbitmq-server
  E: Sub-process /usr/bin/dpkg returned an error code (1)

  root@yakkety:~# dmesg -T
  [Wed Oct 19 11:11:55 2016] async_10[2334]: unhandled signal 11 at 0000000000000000 nip 00000000206867bc lr 0000000020635648 code 30001
  [Wed Oct 19 11:13:02 2016] random: crng init done
  [Wed Oct 19 11:13:02 2016] systemd[1]: apt-daily.timer: Adding 3h 37min 32.381328s random time.
  [Wed Oct 19 11:13:02 2016] systemd[1]: apt-daily.timer: Adding 11h 5min 8.314218s random time.
  [Wed Oct 19 11:13:02 2016] systemd[1]: apt-daily.timer: Adding 11h 7min 37.045127s random time.
  [Wed Oct 19 11:13:03 2016] systemd[1]: apt-daily.timer: Adding 8h 43min 50.771575s random time.
  [Wed Oct 19 11:13:03 2016] systemd[1]: apt-daily.timer: Adding 2h 31min 33.179443s random time.
  [Wed Oct 19 11:13:04 2016] systemd[1]: apt-daily.timer: Adding 4h 22min 42.585438s random time.
  [Wed Oct 19 11:13:04 2016] systemd[1]: apt-daily.timer: Adding 36min 58.644429s random time.
  [Wed Oct 19 11:13:04 2016] systemd[1]: apt-daily.timer: Adding 9h 16min 4.769857s random time.
  [Wed Oct 19 11:13:12 2016] systemd[1]: apt-daily.timer: Adding 7h 48min 614.372ms random time.
  [Wed Oct 19 11:13:12 2016] systemd[1]: apt-daily.timer: Adding 3h 13min 41.779132s random time.
  [Wed Oct 19 11:13:12 2016] systemd[1]: apt-daily.timer: Adding 9h 39min 46.023823s random time.
  [Wed Oct 19 11:13:45 2016] async_10[2912]: unhandled signal 11 at 0000000000000000 nip 000000004f0d67bc lr 000000004f085648 code 30001
  [Wed Oct 19 11:13:45 2016] systemd[1]: apt-daily.timer: Adding 9h 5min 5.067674s random time.
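
The status=139 reported for rabbitmq-server-wait lines up with the
"unhandled signal 11" entries in dmesg: shells conventionally report a
child killed by a signal as 128 plus the signal number, and signal 11 is
SIGSEGV. A small sketch of that arithmetic follows; the function name is
hypothetical.

```shell
# Hypothetical helper: given an exit status, print the signal number it
# encodes under the 128+N shell convention, or 0 if it is a plain exit.
signal_from_status() {
    status="$1"
    if [ "$status" -gt 128 ]; then
        echo $((status - 128))
    else
        echo 0
    fi
}

# Example: the ExecStartPost status from the log above.
#   signal_from_status 139
```

The ExecStart status of 1, by contrast, is an ordinary failure exit and
does not encode a signal.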

  Userspace tool common name: rabbitmq-server

  The userspace tool has the following bit modes: 64

  Userspace package: rabbitmq-server

  I have just tested the patch in https://github.com/rabbitmq/rabbitmq-common/pull/54, which is present on v3.6.1 and prevents the segfault. The patch works and can be easily backported.
  Thanks

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/rabbitmq-server/+bug/1634989/+subscriptions


