[Bug 1657444] Please test proposed package
Corey Bryant
corey.bryant at canonical.com
Thu Jan 18 13:40:14 UTC 2018
Hello likun, or anyone else affected,
Accepted python-oslo.messaging into pike-proposed. The package will
build now and be available in the Ubuntu Cloud Archive in a few hours,
and then in the -proposed repository.
Please help us by testing this new package. To enable the -proposed
repository:
sudo add-apt-repository cloud-archive:pike-proposed
sudo apt-get update
Your feedback will help us get this update out to other Ubuntu users.
If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, and change the tag
from verification-pike-needed to verification-pike-done. If it does not
fix the bug for you, please add a comment stating that, and change the
tag to verification-pike-failed. In either case, details of your testing
will help us make a better decision.
Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in
advance!
** Changed in: cloud-archive/pike
Status: Triaged => Fix Committed
** Tags added: verification-pike-needed
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to python-oslo.messaging in Ubuntu.
https://bugs.launchpad.net/bugs/1657444
Title:
Can't failover when rabbit_hosts is configured as 3 hosts
Status in Ubuntu Cloud Archive:
Invalid
Status in Ubuntu Cloud Archive pike series:
Fix Committed
Status in oslo.messaging:
Fix Released
Status in python-oslo.messaging package in Ubuntu:
Invalid
Status in python-oslo.messaging source package in Artful:
Fix Committed
Bug description:
[Impact]
When the heartbeat connection times out, the timeout is not treated as a
recoverable error, and no reconnection is attempted by calling
ensure_connection(). This leaves the heartbeat thread attempting to
reconnect to the same host over and over again.
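As a rough illustration of the intended behaviour (this is not the actual
oslo.messaging patch), the sketch below treats a heartbeat timeout as a
recoverable error and calls an ensure_connection() that moves on to the
next host. FakeConnection, heartbeat_once and the host names are
hypothetical and exist only for this example.

```python
import itertools
import socket


class FakeConnection:
    """Hypothetical stand-in for a driver connection (not the real class)."""

    def __init__(self, hosts, dead_hosts=()):
        self._hosts = itertools.cycle(hosts)  # round-robin over configured hosts
        self._dead = set(dead_hosts)          # hosts that simulate a timeout
        self.current = next(self._hosts)

    def heartbeat_check(self):
        # Simulate the heartbeat timing out against an unreachable host.
        if self.current in self._dead:
            raise socket.timeout("heartbeat timed out on %s" % self.current)

    def ensure_connection(self):
        # Reconnect by advancing to the next configured host.
        self.current = next(self._hosts)


def heartbeat_once(conn):
    """Run one heartbeat; on timeout, treat it as recoverable and reconnect."""
    try:
        conn.heartbeat_check()
        return True
    except socket.timeout:
        conn.ensure_connection()
        return False


conn = FakeConnection(
    ["hostA:5672", "hostB:5672", "hostC:5672"],
    dead_hosts=["hostA:5672"],
)
first = heartbeat_once(conn)   # times out on hostA, fails over
second = heartbeat_once(conn)  # heartbeat succeeds on hostB
print(first, second, conn.current)
```

Without the except branch, the loop would keep hitting the same dead host,
which is the behaviour reported in this bug.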
[Test Case]
* deploy openstack
bzr branch lp:openstack-charm-testing
cd openstack-charm-testing
juju deployer -c default.yaml -d -v artful-pike
juju add-unit rabbitmq-server
* Force timeout using iptables in a rabbitmq-server node
sudo iptables -I INPUT -p tcp --dport 5672 -j DROP
Expected result:
once the timeout happens, the heartbeat thread reconnects (picking a new rabbit host if needed).
Actual result:
the heartbeat thread is left in a loop (connect, socket closed, retry, connect...)
[Regression Potential]
Without this patch, when the heartbeat connection times out, the library
does not attempt to connect to the next configured rabbit host. The
risk is that, in situations where daemons using this library currently
manage to reconnect to the same host (e.g. the disconnection from the
host lasts only a few seconds), with this change they will instead
reconnect to the next host, so users may see connections flapping between
two (or more) rabbit hosts.
[Other Info]
I have a RabbitMQ cluster of 3 nodes:
root@47704165d2bb:/# rabbitmqctl cluster_status
Cluster status of node rabbit@47704165d2bb ...
[{nodes,[{disc,[rabbit@0482398a286e,rabbit@3709521b608a,
rabbit@47704165d2bb]}]},
{running_nodes,[rabbit@0482398a286e,rabbit@3709521b608a,rabbit@47704165d2bb]},
{cluster_name,<<"rabbit@47704165d2bb">>},
{partitions,[]},
{alarms,[{rabbit@0482398a286e,[]},
{rabbit@3709521b608a,[]},
{rabbit@47704165d2bb,[]}]}]
root@47704165d2bb:/# rabbitmqctl list_policies
Listing policies ...
/ ha-all all ^ha\\. {"ha-mode":"all"} 0
My oslo.messaging client configuration:
[oslo_messaging_rabbit]
rabbit_hosts=120.0.0.56:5671,120.0.0.57:5671,120.0.0.55:5671
rabbit_userid=cloud
rabbit_password=cloud
rabbit_ha_queues=True
rabbit_retry_interval=1
rabbit_retry_backoff=2
rabbit_max_retries=0
rabbit_durable_queues=False
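For reference, here is a minimal sketch of how a comma-separated
rabbit_hosts value like the one above could be split into (host, port)
pairs and walked in round-robin order. It uses only the Python standard
library; parse_hosts is a hypothetical helper and does not reflect how
oslo.messaging or kombu parse this option internally.

```python
import configparser
import itertools

# Same host list as in the configuration above.
CONF_TEXT = """
[oslo_messaging_rabbit]
rabbit_hosts=120.0.0.56:5671,120.0.0.57:5671,120.0.0.55:5671
"""


def parse_hosts(conf_text):
    """Return a list of (host, port) tuples from the rabbit_hosts option."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    raw = parser.get("oslo_messaging_rabbit", "rabbit_hosts")
    hosts = []
    for entry in raw.split(","):
        # Split on the last colon so the port is always the final field.
        host, _, port = entry.strip().rpartition(":")
        hosts.append((host, int(port)))
    return hosts


hosts = parse_hosts(CONF_TEXT)
# Round-robin: each failed attempt moves on to the next host in the list.
rotation = itertools.cycle(hosts)
attempts = [next(rotation) for _ in range(4)]
print(attempts)
```

With three hosts, the fourth attempt wraps around to the first host again,
which is what a round-robin failover would be expected to do.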
When I run "service rabbitmq-server stop" on one node to simulate a
failure, I get the following error logs, and the consumer can't fail over
from the bad node. It keeps reconnecting to the failed node forever instead
of trying the other nodes. "kombu_failover_strategy" is left at its default
value of "round-robin".
2009-01-13 18:32:42.785 17 ERROR oslo.messaging._drivers.impl_rabbit [-] [4e976d46-ceee-4617-b9be-5e4821990738] AMQP server 120.0.0.56:5671 closed the connection. Check login credentials: Socket closed
2009-01-13 18:32:43.819 17 ERROR oslo.messaging._drivers.impl_rabbit [-] Unable to connect to AMQP server on 120.0.0.56:5671 after None tries: Socket closed
2009-01-13 18:32:43.819 17 WARNING oslo.messaging._drivers.impl_rabbit [-] Unexpected error during heartbeart thread processing, retrying...
2009-01-13 18:32:58.874 17 ERROR oslo.messaging._drivers.impl_rabbit [-] [4e976d46-ceee-4617-b9be-5e4821990738] AMQP server 120.0.0.56:5671 closed the connection. Check login credentials: Socket closed
2009-01-13 18:32:59.907 17 ERROR oslo.messaging._drivers.impl_rabbit [-] Unable to connect to AMQP server on 120.0.0.56:5671 after None tries: Socket closed
2009-01-13 18:32:59.907 17 WARNING oslo.messaging._drivers.impl_rabbit [-] Unexpected error during heartbeart thread processing, retrying...
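A quick way to confirm the symptom from a log file is to count which hosts
the failed connection attempts target. The sketch below is only an
illustration: the regular expression is an assumption based on the two
"Unable to connect" lines quoted above, and failed_hosts is a hypothetical
helper.

```python
import re
from collections import Counter

# The two "Unable to connect" lines from the log excerpt above.
LOG_LINES = [
    "2009-01-13 18:32:43.819 17 ERROR oslo.messaging._drivers.impl_rabbit [-] "
    "Unable to connect to AMQP server on 120.0.0.56:5671 after None tries: "
    "Socket closed",
    "2009-01-13 18:32:59.907 17 ERROR oslo.messaging._drivers.impl_rabbit [-] "
    "Unable to connect to AMQP server on 120.0.0.56:5671 after None tries: "
    "Socket closed",
]

# Assumed message format, taken from the excerpt rather than from the source code.
PATTERN = re.compile(r"Unable to connect to AMQP server on (\S+:\d+)")


def failed_hosts(lines):
    """Count failed connection attempts per host:port found in the log lines."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


counts = failed_hosts(LOG_LINES)
# Repeated failures that all target a single host suggest no failover happened.
stuck = len(counts) == 1 and sum(counts.values()) > 1
print(counts, stuck)
```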
Can anyone help? Thanks!
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1657444/+subscriptions