[Bug 1393391] Re: neutron-openvswitch-agent stuck on no queue 'q-agent-notifier-port-update_fanout..

Corey Bryant corey.bryant at canonical.com
Fri Feb 26 20:52:44 UTC 2016


** Description changed:

  Under an HA deployment, neutron-openvswitch-agent can get stuck
  when receiving a close command on a fanout queue the agent is not subscribed to.
  
  It stops responding to any other messages, so it effectively stops
  working altogether.
  
  2014-11-11 10:27:33.092 3027 INFO neutron.common.config [-] Logging enabled!
  2014-11-11 10:27:34.285 3027 INFO neutron.openstack.common.rpc.common [req-66ba318b-0fcc-42c2-959e-9a5233c292ef None] Connected to AMQP server on vip-rabbitmq:5672
  2014-11-11 10:27:34.370 3027 INFO neutron.openstack.common.rpc.common [req-66ba318b-0fcc-42c2-959e-9a5233c292ef None] Connected to AMQP server on vip-rabbitmq:5672
  2014-11-11 10:27:35.348 3027 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [req-66ba318b-0fcc-42c2-959e-9a5233c292ef None] Agent initialized successfully, now running...
  2014-11-11 10:27:35.351 3027 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [req-66ba318b-0fcc-42c2-959e-9a5233c292ef None] Agent out of sync with plugin!
  2014-11-11 10:27:35.401 3027 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [req-66ba318b-0fcc-42c2-959e-9a5233c292ef None] Agent tunnel out of sync with plugin!
  2014-11-11 10:27:35.414 3027 INFO neutron.openstack.common.rpc.common [req-66ba318b-0fcc-42c2-959e-9a5233c292ef None] Connected to AMQP server on vip-rabbitmq:5672
  2014-11-11 10:32:33.143 3027 INFO neutron.agent.securitygroups_rpc [req-22c7fa11-882d-4278-9f83-6dd56ab95ba4 None] Security group member updated [u'4c7b3ad2-4526-48a7-959e-a8b8e4da6413']
  2014-11-11 10:58:11.916 3027 INFO neutron.agent.securitygroups_rpc [req-484fd71f-8f61-496c-aa8a-2d3abf8de365 None] Security group member updated [u'4c7b3ad2-4526-48a7-959e-a8b8e4da6413']
  2014-11-11 10:59:43.954 3027 INFO neutron.agent.securitygroups_rpc [req-2c0bc777-04ed-470a-aec5-927a59100b89 None] Security group member updated [u'4c7b3ad2-4526-48a7-959e-a8b8e4da6413']
  2014-11-11 11:00:22.500 3027 INFO neutron.agent.securitygroups_rpc [req-df447d01-d132-40f2-8528-1c1c4d57c0f5 None] Security group member updated [u'4c7b3ad2-4526-48a7-959e-a8b8e4da6413']
  2014-11-12 01:27:35.662 3027 ERROR neutron.openstack.common.rpc.common [-] Failed to consume message from queue: Socket closed
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common Traceback (most recent call last):
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py", line 579, in ensure
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common     return method(*args, **kwargs)
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py", line 659, in _consume
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common     return self.connection.drain_events(timeout=timeout)
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 281, in drain_events
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common     return self.transport.drain_events(self.connection, **kwargs)
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/kombu/transport/pyamqp.py", line 94, in drain_events
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common     return connection.drain_events(**kwargs)
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 266, in drain_events
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common     chanmap, None, timeout=timeout,
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 328, in _wait_multiple
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common     channel, method_sig, args, content = read_timeout(timeout)
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 292, in read_timeout
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common     return self.method_reader.read_method()
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/amqp/method_framing.py", line 192, in read_method
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common     raise m
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common IOError: Socket closed
  2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common
  2014-11-12 01:27:35.695 3027 INFO neutron.openstack.common.rpc.common [-] Reconnecting to AMQP server on vip-rabbitmq:5672
  2014-11-12 01:27:35.722 3027 INFO neutron.openstack.common.rpc.common [-] Connected to AMQP server on vip-rabbitmq:5672
  2014-11-12 02:00:22.682 3027 ERROR neutron.openstack.common.rpc.common [-] Failed to consume message from queue: Socket closed
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common Traceback (most recent call last):
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py", line 579, in ensure
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common     return method(*args, **kwargs)
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py", line 659, in _consume
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common     return self.connection.drain_events(timeout=timeout)
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 281, in drain_events
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common     return self.transport.drain_events(self.connection, **kwargs)
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/kombu/transport/pyamqp.py", line 94, in drain_events
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common     return connection.drain_events(**kwargs)
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 266, in drain_events
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common     chanmap, None, timeout=timeout,
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 328, in _wait_multiple
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common     channel, method_sig, args, content = read_timeout(timeout)
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 292, in read_timeout
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common     return self.method_reader.read_method()
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/amqp/method_framing.py", line 192, in read_method
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common     raise m
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common IOError: Socket closed
  2014-11-12 02:00:22.682 3027 TRACE neutron.openstack.common.rpc.common
  2014-11-12 02:00:22.683 3027 INFO neutron.openstack.common.rpc.common [-] Reconnecting to AMQP server on vip-rabbitmq:5672
  2014-11-12 02:00:23.017 3027 INFO neutron.openstack.common.rpc.common [-] Connected to AMQP server on vip-rabbitmq:5672
  2014-11-12 02:00:23.021 3027 ERROR root [-] Unexpected exception occurred 1 time(s)... retrying.
  2014-11-12 02:00:23.021 3027 TRACE root Traceback (most recent call last):
  2014-11-12 02:00:23.021 3027 TRACE root   File "/usr/lib/python2.7/site-packages/neutron/openstack/common/excutils.py", line 92, in inner_func
  2014-11-12 02:00:23.021 3027 TRACE root     return infunc(*args, **kwargs)
  2014-11-12 02:00:23.021 3027 TRACE root   File "/usr/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py", line 746, in _consumer_thread
  2014-11-12 02:00:23.021 3027 TRACE root     self.consume()
  2014-11-12 02:00:23.021 3027 TRACE root   File "/usr/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py", line 737, in consume
  2014-11-12 02:00:23.021 3027 TRACE root     six.next(it)
  2014-11-12 02:00:23.021 3027 TRACE root   File "/usr/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py", line 664, in iterconsume
  2014-11-12 02:00:23.021 3027 TRACE root     yield self.ensure(_error_callback, _consume)
  2014-11-12 02:00:23.021 3027 TRACE root   File "/usr/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py", line 579, in ensure
  2014-11-12 02:00:23.021 3027 TRACE root     return method(*args, **kwargs)
  2014-11-12 02:00:23.021 3027 TRACE root   File "/usr/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py", line 657, in _consume
  2014-11-12 02:00:23.021 3027 TRACE root     queues_tail.consume(nowait=False)
  2014-11-12 02:00:23.021 3027 TRACE root   File "/usr/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py", line 190, in consume
  2014-11-12 02:00:23.021 3027 TRACE root     self.queue.consume(*args, callback=_callback, **options)
  2014-11-12 02:00:23.021 3027 TRACE root   File "/usr/lib/python2.7/site-packages/kombu/entity.py", line 598, in consume
  2014-11-12 02:00:23.021 3027 TRACE root     nowait=nowait)
  2014-11-12 02:00:23.021 3027 TRACE root   File "/usr/lib/python2.7/site-packages/amqp/channel.py", line 1769, in basic_consume
  2014-11-12 02:00:23.021 3027 TRACE root     (60, 21),  # Channel.basic_consume_ok
  2014-11-12 02:00:23.021 3027 TRACE root   File "/usr/lib/python2.7/site-packages/amqp/abstract_channel.py", line 71, in wait
  2014-11-12 02:00:23.021 3027 TRACE root     return self.dispatch_method(method_sig, args, content)
  2014-11-12 02:00:23.021 3027 TRACE root   File "/usr/lib/python2.7/site-packages/amqp/abstract_channel.py", line 88, in dispatch_method
  2014-11-12 02:00:23.021 3027 TRACE root     return amqp_method(self, args)
  2014-11-12 02:00:23.021 3027 TRACE root   File "/usr/lib/python2.7/site-packages/amqp/channel.py", line 224, in _close
  2014-11-12 02:00:23.021 3027 TRACE root     raise ChannelError(reply_code, reply_text, (class_id, method_id))
  2014-11-12 02:00:23.021 3027 TRACE root ChannelError: 404: (NOT_FOUND - no queue 'q-agent-notifier-port-update_fanout_cc21f47607704321860757b7e6a1194a' in vhost '/', (60, 20), None)
  2014-11-12 02:00:23.021 3027 TRACE root
  2014-11-12 02:01:24.268 3027 ERROR root [-] Unexpected exception occurred 61 time(s)... retrying.
  2014-11-12 02:01:24.268 3027 TRACE root Traceback (most recent call last):
  2014-11-12 02:01:24.268 3027 TRACE root   File "/usr/lib/python2.7/site-packages/neutron/openstack/common/excutils.py", line 92, in inner_func
  2014-11-12 02:01:24.268 3027 TRACE root     return infunc(*args, **kwargs)
  2014-11-12 02:01:24.268 3027 TRACE root   File "/usr/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py", line 746, in _consumer_thread
  
- 
  ---------------------------
  
  [Impact]
  
  This patch addresses an issue in a RabbitMQ HA deployment where
  neutron-openvswitch-agent gets stuck on a no queue
  'q-agent-notifier-port-update_fanout_xx' error when one of the
  RabbitMQ cluster nodes goes down. In a deployment with more than 100
  nova compute nodes, all neutron agents go down. Restarting
  neutron-openvswitch-agent resolves the problem, but restarting the
  agents on every compute node is not practical and defeats the purpose
  of HA.
  
  [Test Case]
  
  Note: these steps are for trusty-icehouse, including neutron package
  1:2014.1.5-0ubuntu1.
  
  Deploy an OpenStack cloud with multiple rabbit nodes and then abruptly
  kill one of the rabbit nodes (e.g. sudo service rabbitmq-server stop).
  Observe that the neutron agents stop consuming messages and keep
  throwing the no queue 'q-agent-notifier-port-update_fanout..'
  exception.
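
When running the test case, the stuck state can be confirmed from the agent
log rather than by eye. The following is a minimal sketch, assuming the log
lines have the same shape as the traceback quoted in this description; the
function name scan_agent_log and the two regular expressions are illustrative
and not part of neutron.

```python
import re

# Patterns modelled on the log excerpt quoted in this bug description.
RETRY_RE = re.compile(r"Unexpected exception occurred (\d+) time\(s\)")
NOT_FOUND_RE = re.compile(r"ChannelError: 404: \(NOT_FOUND - no queue '([^']+)'")


def scan_agent_log(lines):
    """Return (missing_queue_names, highest_retry_count) found in the lines.

    Hypothetical helper: it only recognises the two message shapes above.
    """
    missing = set()
    max_retries = 0
    for line in lines:
        match = NOT_FOUND_RE.search(line)
        if match:
            missing.add(match.group(1))
        match = RETRY_RE.search(line)
        if match:
            max_retries = max(max_retries, int(match.group(1)))
    return missing, max_retries
```

A retry count that keeps climbing while the same queue name is reported
missing matches the behaviour described in this bug.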
  
  [Regression Potential]
  
- None.
+ The regression potential is low.  The fix is fairly minimal and is
+ limited to the code path where a 404 error occurs.
  
  [Other Info]
  
  The Oslo library has this fix, but because Neutron in Icehouse uses
  kombu rather than the Oslo library, it still suffers from this issue.
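
The description says the fix is limited to the code path where a 404 error
occurs. The actual patch is not shown here, so the following is only a sketch
of one plausible shape for such handling: on a 404 reply, declare the queue
again and retry the consume call. ChannelError, FakeBroker and
consume_with_redeclare are stand-ins written for this example; they are not
the amqp, kombu or neutron APIs.

```python
class ChannelError(Exception):
    """Stand-in for an AMQP channel error; only the reply code matters here."""

    def __init__(self, reply_code, reply_text):
        super().__init__(reply_code, reply_text)
        self.reply_code = reply_code
        self.reply_text = reply_text


class FakeBroker:
    """Hypothetical in-memory broker that tracks which queues exist."""

    def __init__(self):
        self.queues = set()

    def declare(self, name):
        self.queues.add(name)

    def consume(self, name):
        if name not in self.queues:
            raise ChannelError(404, "NOT_FOUND - no queue '%s'" % name)
        return "consuming %s" % name


def consume_with_redeclare(broker, name, max_attempts=3):
    """Try to consume; on a 404, redeclare the queue and try again."""
    for _ in range(max_attempts):
        try:
            return broker.consume(name)
        except ChannelError as exc:
            if exc.reply_code != 404:
                raise  # other channel errors are deliberately not handled
            broker.declare(name)
    raise RuntimeError(
        "queue %r still missing after %d attempts" % (name, max_attempts)
    )
```

Restricting the handler to reply code 404 keeps every other failure on its
existing path, which is in line with the low regression potential stated
above.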

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to neutron in Ubuntu.
https://bugs.launchpad.net/bugs/1393391

Title:
  neutron-openvswitch-agent stuck on no queue 'q-agent-notifier-port-
  update_fanout..
