[Bug 2029952] Re: [SRU] backport msgpack max_buffer_size patch to Focal

Edward Hope-Morley 2029952 at bugs.launchpad.net
Sat Aug 5 11:47:13 UTC 2023


For context, the main culprit for triggering this appears to be neutron
calling [1] on hosts that have large numbers of devices.

In Ussuri this is reached from the following places:

$ egrep -r "\.get_devices(|_info)\(" neutron/{agent,plugins}
neutron/agent/windows/ip_lib.py:        for device in self.get_devices():
neutron/agent/linux/dhcp.py:        for d in ns_ip.get_devices():
neutron/agent/linux/ip_lib.py:        return not self.get_devices()
neutron/agent/l3/namespaces.py:        for d in ns_ip.get_devices():
neutron/agent/l3/dvr_fip_ns.py:        devices = ip_wrapper.get_devices()
neutron/agent/l3/dvr_fip_ns.py:        for d in ip_wrapper.get_devices():
neutron/agent/l3/dvr_edge_router.py:        for d in ns_ip.get_devices():
neutron/agent/l3/dvr_snat_ns.py:        for d in ns_ip.get_devices():
neutron/agent/l3/router_info.py:        ip_devs = ip_wrapper.get_devices()
neutron/plugins/ml2/drivers/macvtap/agent/macvtap_neutron_agent.py:        devices = ip_lib.IPWrapper().get_devices(True)
neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:            devices = self.ip.get_devices(True)

I have personally seen neutron hit this limit when scanning the fip
namespace devices in an l3ha+dvr environment with > 200 routers.

[1]
https://github.com/openstack/neutron/blob/97429dc916c09641f63d69edb3759875a5798e78/neutron/agent/linux/ip_lib.py#L1439

https://bugs.launchpad.net/bugs/2029952

Title:
  [SRU] backport msgpack max_buffer_size patch to Focal

Status in Ubuntu Cloud Archive:
  New
Status in Ubuntu Cloud Archive ussuri series:
  New
Status in Ubuntu Cloud Archive victoria series:
  New
Status in Ubuntu Cloud Archive wallaby series:
  New
Status in Ubuntu Cloud Archive xena series:
  New
Status in Ubuntu Cloud Archive yoga series:
  New
Status in python-oslo.privsep package in Ubuntu:
  New
Status in python-oslo.privsep source package in Focal:
  New
Status in python-oslo.privsep source package in Jammy:
  New

Bug description:
  [Impact]
  Hosts running OpenStack Neutron from the Ussuri to Yoga releases are affected
  because oslo.privsep relies on the default buffer size in python-msgpack:
  commands that return strings > 1MB in size cause privsep to crash and agents
  to stop working. This patch backports a fix that increases the buffer size
  to one that is more appropriate to neutron usage.
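
  To illustrate the failure mode, here is a minimal stdlib-only sketch of a
  length guard on a msgpack "str 32" item (type byte 0xdb, 4-byte big-endian
  length, UTF-8 data). It is a toy model, not the python-msgpack or
  oslo.privsep code: the helper names, the max_str_len parameter name, the
  1 MiB default and the 4 MiB raised value are all assumptions chosen for
  illustration, and the real limit and error text depend on the installed
  msgpack version.

```python
import struct

ONE_MIB = 1024 * 1024
STR32 = 0xDB  # msgpack type byte for a string with a 32-bit length


def encode_str32(text):
    """Encode text as a str 32 item: type byte, big-endian length, UTF-8 data."""
    data = text.encode("utf-8")
    return struct.pack(">BI", STR32, len(data)) + data


def decode_str32(buf, max_str_len=ONE_MIB):
    """Decode one str 32 item, refusing strings longer than max_str_len bytes.

    max_str_len is a hypothetical knob for this sketch only.
    """
    marker, length = struct.unpack_from(">BI", buf, 0)
    if marker != STR32:
        raise ValueError("not a str 32 item")
    if length > max_str_len:
        # The decoder gives up here, which is the kind of ValueError the
        # test plan greps for.
        raise ValueError("%d exceeds max_str_len(%d)" % (length, max_str_len))
    return buf[5:5 + length].decode("utf-8")


# A reply one byte over the limit, standing in for a large device listing.
reply = encode_str32("x" * (ONE_MIB + 1))

try:
    decode_str32(reply)
except ValueError as exc:
    print("default limit:", exc)

# Illustrative raised limit; not the value used by the actual patch.
print("raised limit:", len(decode_str32(reply, max_str_len=4 * ONE_MIB)))
```

  The point of the sketch is that the reply itself is well formed; only the
  receiver's limit decides whether it is accepted, which is why raising the
  limit on the deserializer side is enough.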

  [Test Plan]

    * deploy OpenStack (version corresponding to release of SRU) with l3ha
    * need at least one compute host and two neutron-gateway hosts
    * create a large number of routers each with several networks attached
    * restart neutron-l3-agent
    * wait for the restart to complete, then run 'journalctl --unit neutron-l3-agent --grep ValueError'
    * if the error does not appear then the patch is working
    * note that the routers need enough networks and ports for the netlink info message returned by privsep to exceed the 1MB size limit
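
  The journalctl check in the steps above could be scripted roughly as
  follows. This is a sketch under assumptions: the function names and the
  optional "since" argument are hypothetical, and it simply re-filters the
  journalctl output in Python so that a "no entries" notice is not counted
  as a hit.

```python
import subprocess


def value_error_lines(journal_text):
    """Return the journal lines that mention ValueError."""
    return [line for line in journal_text.splitlines() if "ValueError" in line]


def l3_agent_value_errors(unit="neutron-l3-agent", since=None):
    """Run the journalctl query from the test plan and return matching lines.

    since is optional and is passed to journalctl --since, e.g. the time
    of the agent restart, so that older errors are ignored.
    """
    cmd = ["journalctl", "--no-pager", "--unit", unit, "--grep", "ValueError"]
    if since is not None:
        cmd += ["--since", since]
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return value_error_lines(result.stdout)
```

  Calling l3_agent_value_errors() on a gateway host after the restart and
  getting an empty list corresponds to the "error does not appear" outcome.
  Passing the restart time as "since" avoids counting errors logged before
  the patched package was installed.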

  [Where problems could occur]
  No regressions are expected to occur as a result of this patch as it is
  increasing the buffer size by a fairly small amount to allow neutron
  to function correctly in loaded environments.

  --------------------------------------------------------------------------------

  The main explanation for this backport can be found in
  https://bugs.launchpad.net/ubuntu/+source/python-oslo.privsep/+bug/1896734/comments/37
  but I'm opening a new bug for the privsep backport since 1896734 was
  used to backport a neutron fix.
