[Bug 1929832] Re: stable/ussuri py38 support for keepalived-state-change monitor
Corey Bryant
1929832 at bugs.launchpad.net
Thu May 27 20:22:56 UTC 2021
Uploaded to focal unapproved queue.
** Description changed:
- The victoria release of Openstack received patch [1] which allows the
- neutron-l3-agent to SIGKILL or SIGTERM the keepalived-state-change
- monitor when running under py38. This patch is needed in Ussuri for
- users running with py38 so we need to backport it.
+ [Impact]
+ [Test Case]
+ The Victoria release of OpenStack received patch [1], which allows the neutron-l3-agent to SIGKILL or SIGTERM the keepalived-state-change monitor when running under py38. Users running Ussuri with py38 need this patch, so it must be backported.
The consequence of not having this is that you get the following when
you delete or disable a router:
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent [req-8c69af29-8f9c-4721-9cba-81ff4e9be92c - 9320f5ac55a04fb280d9ceb0b1106a6e - - -] Error while deleting router ab63ccd8-1197-48d0-815e-31adc40e5193: neutron_lib.exceptions.ProcessExecutionError: Exit code: 99; Stdin: ; Stdout: ; Stderr: /usr/bin/neutron-rootwrap: Unauthorized command: kill -15 2516433 (no filter matched)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/agent.py", line 512, in _safe_router_removed
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent self._router_removed(ri, router_id)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/agent.py", line 548, in _router_removed
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent self.router_info[router_id] = ri
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent self.force_reraise()
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent raise value
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/agent.py", line 545, in _router_removed
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent ri.delete()
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/dvr_edge_router.py", line 236, in delete
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent super(DvrEdgeRouter, self).delete()
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/ha_router.py", line 492, in delete
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent self.destroy_state_change_monitor(self.process_monitor)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/ha_router.py", line 438, in destroy_state_change_monitor
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent pm.disable(sig=str(int(signal.SIGTERM)))
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/linux/external_process.py", line 113, in disable
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent utils.execute(cmd, run_as_root=self.run_as_root)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/linux/utils.py", line 147, in execute
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent raise exceptions.ProcessExecutionError(msg,
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent neutron_lib.exceptions.ProcessExecutionError: Exit code: 99; Stdin: ; Stdout: ; Stderr: /usr/bin/neutron-rootwrap: Unauthorized command: kill -15 2516433 (no filter matched)
This results in the router being deleted from neutron but not from the
node. In my case, both a qrouter and an snat namespace were left with IPs
still configured, and my fip ip rule allocation was still present in
/var/lib/neutron/fip-priorities
[1]
https://github.com/openstack/neutron/commit/4fb505891ee32ae41247f1d7a48b7455b342840e
+
+ [Regression Potential]
+ This change is backported from the stable/victoria release to authorize cleanup of keepalived-state-change via rootwrap [2] when running under python3.8. With l3.filters, the risk is a filter mistake that allows or disallows running the intended command. In this case the code is picked directly from stable/victoria and later, and has already been tested by Ed, so the regression potential appears very low.
+ [2] https://wiki.openstack.org/wiki/Rootwrap
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to neutron in Ubuntu.
https://bugs.launchpad.net/bugs/1929832
Title:
stable/ussuri py38 support for keepalived-state-change monitor
Status in Ubuntu Cloud Archive:
Invalid
Status in Ubuntu Cloud Archive ussuri series:
In Progress
Status in neutron:
In Progress
Status in neutron package in Ubuntu:
Invalid
Status in neutron source package in Focal:
Triaged
Bug description:
[Impact]
[Test Case]
The Victoria release of OpenStack received patch [1], which allows the neutron-l3-agent to SIGKILL or SIGTERM the keepalived-state-change monitor when running under py38. Users running Ussuri with py38 need this patch, so it must be backported.
The consequence of not having this is that you get the following when
you delete or disable a router:
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent [req-8c69af29-8f9c-4721-9cba-81ff4e9be92c - 9320f5ac55a04fb280d9ceb0b1106a6e - - -] Error while deleting router ab63ccd8-1197-48d0-815e-31adc40e5193: neutron_lib.exceptions.ProcessExecutionError: Exit code: 99; Stdin: ; Stdout: ; Stderr: /usr/bin/neutron-rootwrap: Unauthorized command: kill -15 2516433 (no filter matched)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/agent.py", line 512, in _safe_router_removed
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent self._router_removed(ri, router_id)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/agent.py", line 548, in _router_removed
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent self.router_info[router_id] = ri
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent self.force_reraise()
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent raise value
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/agent.py", line 545, in _router_removed
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent ri.delete()
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/dvr_edge_router.py", line 236, in delete
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent super(DvrEdgeRouter, self).delete()
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/ha_router.py", line 492, in delete
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent self.destroy_state_change_monitor(self.process_monitor)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/ha_router.py", line 438, in destroy_state_change_monitor
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent pm.disable(sig=str(int(signal.SIGTERM)))
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/linux/external_process.py", line 113, in disable
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent utils.execute(cmd, run_as_root=self.run_as_root)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/linux/utils.py", line 147, in execute
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent raise exceptions.ProcessExecutionError(msg,
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent neutron_lib.exceptions.ProcessExecutionError: Exit code: 99; Stdin: ; Stdout: ; Stderr: /usr/bin/neutron-rootwrap: Unauthorized command: kill -15 2516433 (no filter matched)
This results in the router being deleted from neutron but not from the
node. In my case, both a qrouter and an snat namespace were left with IPs
still configured, and my fip ip rule allocation was still present in
/var/lib/neutron/fip-priorities
[1]
https://github.com/openstack/neutron/commit/4fb505891ee32ae41247f1d7a48b7455b342840e
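A rough way to check for the leftover namespaces described above is to scan the output of `ip netns list` for names tied to the deleted router. The sketch below assumes the usual `qrouter-<router id>` and `snat-<router id>` naming; the helper name and the sample text are hypothetical and for illustration only, not captured output.

```python
def leftover_namespaces(netns_output, router_id):
    """Return namespace names from `ip netns list` text that belong to router_id.

    Assumes namespaces are named qrouter-<id> and snat-<id> (an assumption,
    not taken from this bug report's logs).
    """
    wanted = {f"qrouter-{router_id}", f"snat-{router_id}"}
    found = []
    for line in netns_output.splitlines():
        parts = line.split()
        # The namespace name is the first token; an "(id: N)" suffix may follow.
        if parts and parts[0] in wanted:
            found.append(parts[0])
    return found


# Illustrative sample text only, not real output from the affected node.
router_id = "ab63ccd8-1197-48d0-815e-31adc40e5193"
sample = (
    f"qrouter-{router_id} (id: 3)\n"
    f"snat-{router_id} (id: 4)\n"
    "qdhcp-00000000-0000-0000-0000-000000000000 (id: 1)\n"
)
print(leftover_namespaces(sample, router_id))
```

An empty list after deleting the router would suggest the node was cleaned up; any entries point to the partial deletion described in this report.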
[Regression Potential]
This change is backported from the stable/victoria release to authorize cleanup of keepalived-state-change via rootwrap [2] when running under python3.8. With l3.filters, the risk is a filter mistake that allows or disallows running the intended command. In this case the code is picked directly from stable/victoria and later, and has already been tested by Ed, so the regression potential appears very low.
[2] https://wiki.openstack.org/wiki/Rootwrap
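To illustrate why a missing filter entry produces "Unauthorized command: kill -15 ... (no filter matched)", here is a simplified model of kill-filter matching. It is not oslo.rootwrap's implementation: the filter lists, the function name, and the assumption that the monitor process runs under /usr/bin/python3.8 are all hypothetical, chosen only to show the allow/disallow behaviour discussed above.

```python
import os

# Hypothetical, simplified filter model: (target executable name, allowed signals).
# These lists are illustrative and are not copied from l3.filters.
FILTERS_BEFORE = [
    ("python3.6", {"-15", "-9"}),
    ("python3.7", {"-15", "-9"}),
]
FILTERS_AFTER = FILTERS_BEFORE + [("python3.8", {"-15", "-9"})]


def kill_allowed(filters, target_exe, argv):
    """Return True if `kill <signal> <pid>` matches a filter for the target's executable."""
    if len(argv) != 3 or argv[0] != "kill":
        return False
    signal_arg = argv[1]
    exe_name = os.path.basename(target_exe)
    return any(
        exe_name == name and signal_arg in signals
        for name, signals in filters
    )


cmd = ["kill", "-15", "2516433"]
# Assumption: the state-change monitor is a Python 3.8 process.
print(kill_allowed(FILTERS_BEFORE, "/usr/bin/python3.8", cmd))
print(kill_allowed(FILTERS_AFTER, "/usr/bin/python3.8", cmd))
```

In this model the same command is rejected until an entry for the python3.8 executable exists, which mirrors the kind of filter mistake the regression note is concerned with.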
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1929832/+subscriptions