[Bug 653405] Re: rabbitmq-server fails to start if hostname is unresolvable

Clint Byrum clint at fewbar.com
Wed Dec 15 19:08:44 GMT 2010


In a discussion in #rabbitmq on Freenode, the users there informed me
that there are actually two problems at play here.

Comment #3's log indicates that something was already listening on
rabbitmq's port, presumably rabbitmq itself. It does not indicate that
there was a failure looking up the hostname.

What really seems to be the big issue is that if the hostname changes,
rabbitmq cannot be restarted, nor can it be effectively queried for
status. This is because the hostname is stored somewhere in
/var/lib/rabbitmq.

This actually means that any time the hostname changes, rabbitmq cannot
be restarted without clearing all of its persistent storage or changing
the hostname back.

Ouch.

It's not clear that we can fix that, but we can at least make the init.d
script provide some useful warnings when:

* the hostname has changed from the original (rabbitmqctl status will predictably fail in this instance)
* the hostname is unresolvable (a simple host lookup will do)

Instructing users to change the hostname back, and/or change it to
something resolvable, would probably be the best way to go.
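
As a rough sketch of the "simple host lookup" warning, something like the
following could be sourced by the init.d script. This is not the actual
packaging change; the function names are hypothetical, and it assumes
getent(1) is available to query the system resolver.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers for an init.d pre-start check.
# Assumes getent(1) is installed; the function names are made up here.

# Succeeds if the given name resolves through the system resolver.
hostname_resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Prints a warning on stderr and returns 1 if the name does not resolve.
warn_if_unresolvable() {
    name="$1"
    if hostname_resolves "$name"; then
        return 0
    fi
    echo "WARNING: hostname '$name' is not resolvable;" \
         "rabbitmq-server will probably fail to start." >&2
    echo "Change the hostname to something resolvable," \
         "or make '$name' resolvable (e.g. via /etc/hosts)." >&2
    return 1
}
```

It would be called as: warn_if_unresolvable "$(hostname)"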

For now, a workaround is to run

rabbitmqctl status

If this fails with a message similar to this:
# rabbitmqctl status
Status of node rabbit@foo ...
Error: unable to connect to node rabbit@foo: nodedown
diagnostics:
- unable to connect to epmd on foo: nxdomain
- current node: rabbitmqctl612@foo
- current node home dir: /var/lib/rabbitmq
- current node cookie hash: kaX/5T0iLfPB+YcqbsHCJA==

then the hostname is probably unresolvable. However, if the output at
least includes a listing of 'nodes and their ports on XXX', like this:

# rabbitmqctl status
Status of node rabbit@localhost ...
Error: unable to connect to node rabbit@localhost: nodedown
diagnostics:
- nodes and their ports on localhost: [{rabbitmqctl640,53615}]
- current node: rabbitmqctl640@localhost
- current node home dir: /var/lib/rabbitmq
- current node cookie hash: kaX/5T0iLfPB+YcqbsHCJA==


Then the hostname is resolvable, but rabbitmq is not running.

If, however, you have changed your hostname, you will get this error:

# rabbitmqctl status
Status of node 'rabbit@clint-MacBookPro' ...
Error: unable to connect to node 'rabbit@clint-MacBookPro': nodedown
diagnostics:

=ERROR REPORT==== 15-Dec-2010::11:06:16 ===
Error in process <0.36.0> on node 'rabbitmqctl781@clint-MacBookPro' with exit value: {badarg,[{erlang,list_to_existing_atom,["rabbit@localhost"]},{dist_util,recv_challenge,1},{dist_util,handshake_we_started,1}]}

- nodes and their ports on clint-MacBookPro: [{rabbit,53426},
                                              {rabbitmqctl781,56544}]
- current node: 'rabbitmqctl781@clint-MacBookPro'
- current node home dir: /var/lib/rabbitmq
- current node cookie hash: kaX/5T0iLfPB+YcqbsHCJA==


Note that it mentions list_to_existing_atom,["rabbit@localhost"]. The
part after the @ is the old hostname, and it is the one that you must
set your hostname back to in order to start rabbitmq again. I will set
about trying to hack that into the init.d script to at least give people
a fighting chance to correct their hostname.
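
To illustrate, the three failure cases above could be told apart with a
small shell function like the one below. This is only a sketch, not what
will necessarily land in the init.d script: the function name and the
labels it prints are made up, and it assumes the diagnostics look like
the samples in this comment, with node names written as name@host.

```shell
#!/bin/sh
# Sketch only: classify `rabbitmqctl status` output read from stdin.
# Prints one of (labels are hypothetical):
#   unresolvable        hostname lookup failed (nxdomain)
#   changed <old-host>  node was created under a different hostname
#   not-running         hostname resolves, but rabbitmq is down
#   unknown             none of the patterns matched
classify_status() {
    out=$(cat)
    case "$out" in
        *nxdomain*)
            echo "unresolvable" ;;
        *list_to_existing_atom*)
            # Pull the host part out of ["rabbit@oldhost"].
            old=$(printf '%s\n' "$out" |
                sed -n 's/.*list_to_existing_atom,\["[^"@]*@\([^"]*\)"].*/\1/p' |
                head -n 1)
            echo "changed $old" ;;
        *"nodes and their ports on"*)
            echo "not-running" ;;
        *)
            echo "unknown" ;;
    esac
}
```

It would be used as: rabbitmqctl status 2>&1 | classify_status
The list_to_existing_atom check comes before the 'nodes and their ports'
check because the changed-hostname output contains both.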

Also, if you try to start rabbitmq and you get TIMEOUT, that's caused
by having changed the hostname as well.


** Summary changed:

- rabbitmq-server fails to start if hostname is unresolvable
+ rabbitmq-server fails to start if hostname is unresolvable or has changed since first starting

https://bugs.launchpad.net/bugs/653405
