[Bug 2029952] Re: [SRU] backport msgpack max_buffer_size patch to Focal
Edward Hope-Morley
2029952 at bugs.launchpad.net
Sat Aug 5 11:47:13 UTC 2023
For context, the main culprit for triggering this appears to be neutron
calling [1] on hosts that have large numbers of devices.
In ussuri it is called from the following places:
$ egrep -r "\.get_devices(|_info)\(" neutron/{agent,plugins}
neutron/agent/windows/ip_lib.py: for device in self.get_devices():
neutron/agent/linux/dhcp.py: for d in ns_ip.get_devices():
neutron/agent/linux/ip_lib.py: return not self.get_devices()
neutron/agent/l3/namespaces.py: for d in ns_ip.get_devices():
neutron/agent/l3/dvr_fip_ns.py: devices = ip_wrapper.get_devices()
neutron/agent/l3/dvr_fip_ns.py: for d in ip_wrapper.get_devices():
neutron/agent/l3/dvr_edge_router.py: for d in ns_ip.get_devices():
neutron/agent/l3/dvr_snat_ns.py: for d in ns_ip.get_devices():
neutron/agent/l3/router_info.py: ip_devs = ip_wrapper.get_devices()
neutron/plugins/ml2/drivers/macvtap/agent/macvtap_neutron_agent.py: devices = ip_lib.IPWrapper().get_devices(True)
neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py: devices = self.ip.get_devices(True)
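For anyone who wants to repeat the search above from Python (for example to compare releases in a script), here is a minimal sketch. The function name find_callers is hypothetical, and restricting the scan to *.py files is an assumption that egrep -r does not make.

```python
import os
import re

# Same expression as the egrep pattern above, in Python regex syntax.
PATTERN = re.compile(r"\.get_devices(|_info)\(")


def find_callers(root):
    """Yield (path, stripped_line) for each matching line under root.

    Hypothetical helper: only *.py files are scanned, which is an
    assumption; egrep -r would look at every file.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for line in handle:
                    if PATTERN.search(line):
                        yield path, line.strip()


if __name__ == "__main__":
    # Run from the top of a neutron checkout, as with the egrep command.
    for subdir in ("neutron/agent", "neutron/plugins"):
        for path, line in find_callers(subdir):
            print("%s: %s" % (path, line))
```

If the two directories do not exist in the current working directory, the loop simply prints nothing.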
I have personally found that in an l3ha+dvr environment with > 200
routers, neutron hits this limit when scanning the fip ns devices.
[1]
https://github.com/openstack/neutron/blob/97429dc916c09641f63d69edb3759875a5798e78/neutron/agent/linux/ip_lib.py#L1439
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to python-oslo.privsep in Ubuntu.
https://bugs.launchpad.net/bugs/2029952
Title:
[SRU] backport msgpack max_buffer_size patch to Focal
Status in Ubuntu Cloud Archive:
New
Status in Ubuntu Cloud Archive ussuri series:
New
Status in Ubuntu Cloud Archive victoria series:
New
Status in Ubuntu Cloud Archive wallaby series:
New
Status in Ubuntu Cloud Archive xena series:
New
Status in Ubuntu Cloud Archive yoga series:
New
Status in python-oslo.privsep package in Ubuntu:
New
Status in python-oslo.privsep source package in Focal:
New
Status in python-oslo.privsep source package in Jammy:
New
Bug description:
[Impact]
Hosts running OpenStack Neutron from the Ussuri to Yoga releases are impacted
because their use of oslo.privsep is constrained by a default buffer size in
python-msgpack: commands that return strings > 1MB in size cause privsep to
crash and agents to stop working. This patch backports a fix that increases
the buffer size to one that is more appropriate to neutron usage.
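
The failure mode described above can be illustrated with a toy model. This is only a sketch: BoundedBuffer and can_receive are hypothetical names, the code is not msgpack's or oslo.privsep's implementation, and the 4 MiB figure is an arbitrary stand-in for "a larger limit", not the value used by the patch.

```python
# Toy model only: NOT the actual msgpack or oslo.privsep code.
ONE_MIB = 1024 * 1024


class BoundedBuffer:
    """Accumulates received bytes and refuses to grow past max_size."""

    def __init__(self, max_size=ONE_MIB):
        self.max_size = max_size
        self._buf = bytearray()

    def feed(self, chunk):
        wanted = len(self._buf) + len(chunk)
        if wanted > self.max_size:
            # A ValueError is what the test plan greps the journal for.
            raise ValueError(
                "%d exceeds max size (%d)" % (wanted, self.max_size))
        self._buf.extend(chunk)

    def __len__(self):
        return len(self._buf)


def can_receive(payload, max_size):
    """Return True if payload fits in a buffer limited to max_size bytes."""
    buf = BoundedBuffer(max_size)
    try:
        buf.feed(payload)
    except ValueError:
        return False
    return True


# Hypothetical reply that is one byte over 1 MiB, standing in for a large
# device listing returned through privsep.
reply = b"x" * (ONE_MIB + 1)

print(can_receive(reply, ONE_MIB))      # rejected by the 1 MiB limit
print(can_receive(reply, 4 * ONE_MIB))  # accepted once the limit is raised
```

The point of the sketch is that nothing is wrong with the reply itself; it is only the receiver's size limit that turns a large but valid message into an error.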
[Test Plan]
* deploy OpenStack (version corresponding to release of SRU) with l3ha
* need at least one compute host and two neutron-gateway hosts
* create a large number of routers each with several networks attached
* restart neutron-l3-agent
* wait for restart to complete then do 'journalctl --unit neutron-l3-agent --grep ValueError'
* if the error does not appear then the patch is working
* note that the routers must have enough networks and ports for the netlink info message returned by privsep to exceed the 1MB size limit
[Where problems could occur]
No regressions are expected to occur as a result of this patch as it is
increasing the buffer size by a fairly small amount to allow neutron
to function correctly in loaded environments.
--------------------------------------------------------------------------------
The main explanation for this backport can be found in
https://bugs.launchpad.net/ubuntu/+source/python-oslo.privsep/+bug/1896734/comments/37
but I'm opening a new bug for the privsep backport since 1896734 was
used to backport a neutron fix.
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/2029952/+subscriptions