[Bug 1662804] Re: [SRU] Agent is failing to process HA router if initialize() fails
OpenStack Infra
1662804 at bugs.launchpad.net
Fri May 26 02:58:29 UTC 2017
Reviewed: https://review.openstack.org/452100
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=b98267f73af5a6c6388a76a73d88e46c90f8a71e
Submitter: Jenkins
Branch: stable/newton
commit b98267f73af5a6c6388a76a73d88e46c90f8a71e
Author: venkata anil <anilvenkata at redhat.com>
Date: Wed Feb 8 15:49:47 2017 +0000
Avoid router ri.process if initialize() fails
When router_info initialize() fails(with trace) some resources(
like keepalived process) may not be created. While handling this
exception, l3 agent calls _process_updated_router instead of
again calling _process_added_router, which also fails trying to
access resources which are not created.
In this change, agent will have new router_info(i.e
self.router_info[router_id] = ri) only when initialize() succeeds.
When initialize() fails, as router_info is not part of agent,
"_process_router_if_compatible" will again call initialize().
We also cleanup router_info when initialize() fails.
Closes-bug: #1662804
Change-Id: I278ac83de57713c93d6e50846d79034d774c5d47
(cherry picked from commit 3e1ed94e389c427f1da56cde43a458832078f073)
(cherry picked from commit 71c0e8940661fefbe2830258509e6c4afb887783)
** Tags added: in-stable-newton
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to neutron in Ubuntu.
https://bugs.launchpad.net/bugs/1662804
Title:
[SRU] Agent is failing to process HA router if initialize() fails
Status in Ubuntu Cloud Archive:
Fix Released
Status in neutron:
Fix Released
Status in neutron package in Ubuntu:
Fix Released
Bug description:
[Impact]
This patch resolves, amongst other things, issues with a create and
delete router request race condition when using l3 HA. At the time of
backport this patch is already available from Ocata onwards and has
been verified as sufficiently minimal and safe for backport to Newton
and Mitaka. Essentially the error case is a result of an incorrectly
intialised router update action being executed without proper checks
and this patch fixes this.
[Test Case]
* Deploy Openstack Mitaka - http://pastebin.ubuntu.com/24637244/ -
with neutron-l3-agent configured to provide HA (vrrp) routers.
* Repeatedly create and delete routers in rapid succession and check
that the l3 agent does not go into an infinite error loop i.e. run
http://pastebin.ubuntu.com/24634950/ and run do tail -F
/var/log/neutron/neutron-l3-agent.log on all units of l3 agent. Also
check that qrouter- namepspaces are not stacking up. For Mitaka I
typically hit the error after ~20 create/deletes.
[Regression Potential]
* I do not envisage any regression potential from this patch.
====
When HA router initialize() function fails for some reason(rabbitmq
restart or no ha_port), keepalived_manager or KeepalivedInstance won't
be configured. In this case, _process_router_if_compatible fails with
exception, then _resync_router(update) will again try to process this
router in loop. As we try initialize() only once(which was failed),
retry of _process_router_if_compatible will always fail(no keepalived
manager or instance) and router is never configured(see below trace).
2017-02-06 18:34:18.539 26120 DEBUG neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'qrouter-114a72fe-02ae-4b87-a2e7-70f962df0951', 'ip', '-o', 'link', 'show', 'qr-e6
3406e1-e7'] execute_rootwrap_daemon /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:101
2017-02-06 18:34:18.544 26120 DEBUG neutron.agent.linux.utils [-]
Command: ['ip', 'netns', 'exec', u'qrouter-114a72fe-02ae-4b87-a2e7-70f962df0951', 'ip', '-o', 'link', 'show', u'qr-e63406e1-e7']
Exit code: 0
execute /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:156
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info [-] 'NoneType' object has no attribute 'get_process'
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info Traceback (most recent call last):
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 359, in call
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info return func(*args, **kwargs)
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 744, in process
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info self._process_internal_ports(agent.pd)
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 394, in _process_internal_ports
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info self.internal_network_added(p)
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 275, in internal_network_added
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info self._disable_ipv6_addressing_on_interface(interface_name)
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 235, in _disable_ipv6_addressing_on_interface
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info if self._should_delete_ipv6_lladdr(ipv6_lladdr):
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 217, in _should_delete_ipv6_lladdr
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info if manager.get_process().active:
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info AttributeError: 'NoneType' object has no attribute 'get_process'
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent [-] Failed to process compatible router '114a72fe-02ae-4b87-a2e7-70f962df0951'
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 506, in _process_router_update
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 445, in _process_router_if_compatible
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent self._process_updated_router(router)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 459, in _process_updated_router
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent ri.process(self)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 377, in process
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent super(HaRouter, self).process(agent)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 362, in call
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent self.logger(e)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 204, in __exit__
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 359, in call
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent return func(*args, **kwargs)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 744, in process
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent self._process_internal_ports(agent.pd)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 394, in _process_internal_ports
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent self.internal_network_added(p)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 275, in internal_network_added
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent self._disable_ipv6_addressing_on_interface(interface_name)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 235, in _disable_ipv6_addressing_on_interface
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent if self._should_delete_ipv6_lladdr(ipv6_lladdr):
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 217, in _should_delete_ipv6_lladdr
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent if manager.get_process().active:
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent AttributeError: 'NoneType' object has no attribute 'get_process'
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1662804/+subscriptions
More information about the Ubuntu-openstack-bugs
mailing list