[Bug 1772236] Re: rabbit died and everything else died
Iain Lane
iain at orangesquash.org.uk
Fri Jun 8 10:35:59 UTC 2018
Worth noting that we ran for many months before hitting this, and then:
ubuntu@juju-prod-ues-proposed-migration-machine-1:~$ dmesg -T | grep "Out of memory: Kill" | uniq
[Wed May 9 16:06:23 2018] Out of memory: Kill process 1408 (beam) score 215 or sacrifice child
[Sun May 20 03:58:24 2018] Out of memory: Kill process 19495 (beam) score 437 or sacrifice child
[Fri Jun 1 02:28:15 2018] Out of memory: Kill process 6569 (beam) score 428 or sacrifice child
[Fri Jun 8 05:37:54 2018] Out of memory: Kill process 1142 (beam) score 434 or sacrifice child
That is four OOM kills of beam in a month.
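For reference, the gaps between those kills can be worked out directly from the dmesg lines. This is a minimal sketch, assuming the `dmesg -T` line format shown above; `parse_oom_kills` is a hypothetical helper name, not part of any existing tool.

```python
import re
from datetime import datetime

# Matches lines such as:
# [Wed May 9 16:06:23 2018] Out of memory: Kill process 1408 (beam) score 215 or sacrifice child
OOM_RE = re.compile(
    r"^\[(?P<when>[^\]]+)\] Out of memory: Kill process (?P<pid>\d+) "
    r"\((?P<comm>[^)]+)\) score (?P<score>\d+)"
)


def parse_oom_kills(lines):
    """Return (timestamp, pid, command, score) for each OOM-kill line."""
    kills = []
    for line in lines:
        m = OOM_RE.match(line.strip())
        if m:
            # Collapse repeated spaces, which dmesg may use to pad single-digit days.
            stamp = " ".join(m["when"].split())
            when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
            kills.append((when, int(m["pid"]), m["comm"], int(m["score"])))
    return kills


# The four lines quoted above.
DMESG = """\
[Wed May 9 16:06:23 2018] Out of memory: Kill process 1408 (beam) score 215 or sacrifice child
[Sun May 20 03:58:24 2018] Out of memory: Kill process 19495 (beam) score 437 or sacrifice child
[Fri Jun 1 02:28:15 2018] Out of memory: Kill process 6569 (beam) score 428 or sacrifice child
[Fri Jun 8 05:37:54 2018] Out of memory: Kill process 1142 (beam) score 434 or sacrifice child
"""

kills = parse_oom_kills(DMESG.splitlines())
# Time elapsed between each kill and the next one.
gaps = [later[0] - earlier[0] for earlier, later in zip(kills, kills[1:])]
for gap in gaps:
    print(gap)
```

On the lines above this gives gaps of roughly 10.5, 12 and 7 days, so the interval between kills is not obviously constant.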
One thing that happened at roughly this time is the upload of
https://launchpad.net/ubuntu/+source/erlang/1:18.3-dfsg-1ubuntu3.1
but that is only a vague correlation; I have no evidence of causation.
It does seem that rabbit's memory usage grows over time until it is
eventually killed by the OOM killer.
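To confirm the growth rather than infer it from the kills, the resident set size of the beam process could be sampled periodically, for example from cron. A sketch, assuming Linux's /proc/<pid>/status format; the function names and the sample text are made up for illustration and are not data from the affected machine.

```python
import re


def rss_kib(status_text):
    """Extract VmRSS (in kB) from the text of /proc/<pid>/status, or None."""
    m = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(m.group(1)) if m else None


def read_rss_kib(pid):
    """Read the current resident set size of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return rss_kib(f.read())


# Illustrative input only, NOT taken from the affected machine.
SAMPLE = "Name:\tbeam\nVmPeak:\t  900000 kB\nVmRSS:\t  123456 kB\nThreads:\t4\n"
print(rss_kib(SAMPLE))
```

Logging `read_rss_kib(pid)` with a timestamp every few minutes would show whether usage climbs steadily or jumps shortly before each kill.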
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to rabbitmq-server in Ubuntu.
https://bugs.launchpad.net/bugs/1772236
Title:
rabbit died and everything else died
Status in Auto Package Testing:
New
Status in rabbitmq-server package in Ubuntu:
New
Bug description:
Why did it die?
Should it have self-restarted?
ubuntu@juju-prod-ues-proposed-migration-machine-1:~$ journalctl -u rabbitmq-server.service -n1000 | cat
-- Logs begin at Sun 2018-05-20 00:18:25 UTC, end at Sun 2018-05-20 08:58:27 UTC. --
May 20 04:00:11 juju-prod-ues-proposed-migration-machine-1 systemd[1]: rabbitmq-server.service: Main process exited, code=exited, status=137/n/a
May 20 04:00:12 juju-prod-ues-proposed-migration-machine-1 rabbitmq[28971]: Stopping and halting node 'rabbit@ps45-10-25-180-146' ...
May 20 04:00:12 juju-prod-ues-proposed-migration-machine-1 rabbitmq[28971]: Error: unable to connect to node 'rabbit@ps45-10-25-180-146': nodedown
May 20 04:00:12 juju-prod-ues-proposed-migration-machine-1 rabbitmq[28971]: DIAGNOSTICS
May 20 04:00:12 juju-prod-ues-proposed-migration-machine-1 rabbitmq[28971]: ===========
May 20 04:00:12 juju-prod-ues-proposed-migration-machine-1 rabbitmq[28971]: attempted to contact: ['rabbit@ps45-10-25-180-146']
May 20 04:00:12 juju-prod-ues-proposed-migration-machine-1 rabbitmq[28971]: rabbit@ps45-10-25-180-146:
May 20 04:00:12 juju-prod-ues-proposed-migration-machine-1 rabbitmq[28971]: * connected to epmd (port 4369) on ps45-10-25-180-146
May 20 04:00:12 juju-prod-ues-proposed-migration-machine-1 rabbitmq[28971]: * epmd reports: node 'rabbit' not running at all
May 20 04:00:12 juju-prod-ues-proposed-migration-machine-1 rabbitmq[28971]: other nodes on ps45-10-25-180-146: ['rabbitmq-cli-28979']
May 20 04:00:12 juju-prod-ues-proposed-migration-machine-1 rabbitmq[28971]: * suggestion: start the node
May 20 04:00:12 juju-prod-ues-proposed-migration-machine-1 rabbitmq[28971]: current node details:
May 20 04:00:12 juju-prod-ues-proposed-migration-machine-1 rabbitmq[28971]: - node name: 'rabbitmq-cli-28979@juju-prod-ues-proposed-migration-machine-1'
May 20 04:00:12 juju-prod-ues-proposed-migration-machine-1 rabbitmq[28971]: - home dir: .
May 20 04:00:12 juju-prod-ues-proposed-migration-machine-1 rabbitmq[28971]: - cookie hash: 7+AChRZDewWFJK8SEUhx+Q==
May 20 04:00:12 juju-prod-ues-proposed-migration-machine-1 systemd[1]: rabbitmq-server.service: Control process exited, code=exited status=2
May 20 04:00:12 juju-prod-ues-proposed-migration-machine-1 systemd[1]: rabbitmq-server.service: Unit entered failed state.
May 20 04:00:12 juju-prod-ues-proposed-migration-machine-1 systemd[1]: rabbitmq-server.service: Failed with result 'exit-code'.
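On "why did it die": the journal reports status=137 for the main process. By the usual shell convention an exit status above 128 means "terminated by signal (status - 128)", and 137 - 128 = 9 is SIGKILL, which is what the kernel OOM killer sends. A small sketch of that decoding follows; `signal_from_exit_status` is a hypothetical helper name, and the signal-number lookup assumes a POSIX system such as Linux.

```python
import signal


def signal_from_exit_status(status):
    """Map a shell-style exit status (128 + N) to a signal name, else None."""
    if status <= 128:
        return None  # ordinary exit code, not a signal death
    try:
        return signal.Signals(status - 128).name
    except ValueError:
        return None  # no signal with that number on this platform


print(signal_from_exit_status(137))  # main process status from the journal
print(signal_from_exit_status(2))    # control process status from the journal
```

This does not prove the OOM killer was responsible, but a SIGKILL of the main process at about that time would be consistent with an OOM kill.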
To manage notifications about this bug go to:
https://bugs.launchpad.net/auto-package-testing/+bug/1772236/+subscriptions