[Bug 1963698] Re: ovn-controller on Wallaby creates high CPU usage after moving port

Nobuto Murata 1963698 at bugs.launchpad.net
Mon Mar 7 06:51:29 UTC 2022


In this specific case (the environment Olivier described), we tested
focal-xena and the issue was NOT reproducible. We've decided to go with
Xena so field-high can be dropped (I'm not able to remove the
subscription by myself here).

It may be specific to focal-wallaby, since we haven't seen this kind of issue at other customers running Ussuri, so some patches may need to be backported. For example, another distribution seems to have backported the following:
https://github.com/ovn-org/ovn/commit/c83294970c62f662015a7979b12250580bee3001
(no idea if it's connected to the issue or not though)

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ovn in Ubuntu.
https://bugs.launchpad.net/bugs/1963698

Title:
  ovn-controller on Wallaby creates high CPU usage after moving port

Status in ovn package in Ubuntu:
  New

Bug description:
  We are deploying Focal Wallaby for a customer
  Neutron package version (2:18.2.0-0ubuntu1~cloud0), GLIBC 2.31-0ubuntu9.7

  When running rally/tempest tests that create some VMs, the following symptoms occur:
  1) A huge increase in the size of, and write load on, /var/lib/openvswitch/conf.db
  (If ovsdb-server is restarted while the OVS database is a few GB in size, the unit can fail to start)

  2) Very high CPU usage by the following processes:
  * neutron-ovn-metadata-agent
  * nova-compute
  * ovn-controller
  * ovsdb-server

  3) The Nova compute node may experience severe delays and may time out
  when creating any instance (for Nova or Octavia Amphora) on it.
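
Symptom 1 can be watched for with a small size check on the database file. This is only a minimal sketch: the function name `db_size_report` and the 1 GiB default threshold are assumptions chosen for illustration (the report only says "a few GB"), and the path is the one quoted above.

```python
import os

# Path taken from the bug description; adjust if the deployment differs.
DEFAULT_DB_PATH = "/var/lib/openvswitch/conf.db"


def db_size_report(path=DEFAULT_DB_PATH, warn_bytes=1 << 30):
    """Return the size of the OVS database file and whether it exceeds a threshold.

    warn_bytes is a hypothetical threshold (1 GiB by default), not a value
    taken from OVS or from the bug report.
    """
    size = os.path.getsize(path)
    return {
        "path": path,
        "size_bytes": size,
        "oversized": size >= warn_bytes,
    }
```

Calling `db_size_report()` periodically on a hypervisor and logging the result would show whether the file keeps growing during a rally/tempest run.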

  A temporary workaround is to restart the ovn-controller service.
  The issue then reappears after some time on a different hypervisor.

  So far it has only been reproducible on a customer deployment with
  many Nova-compute units.

  Ovn-controller.log on the hypervisor:
  2022-03-04T12:54:43.065Z|00479|binding|INFO|Changing chassis for lport cr-lrp-f741e3f2-4708-4091-841d-4a9c05f09b53 from comp04.maas to comp18.maas.
  2022-03-04T12:54:43.065Z|00480|binding|INFO|cr-lrp-f741e3f2-4708-4091-841d-4a9c05f09b53: Claiming fa:16:3e:15:1f:a6 10.218.131.106/18
  2022-03-04T12:54:43.077Z|00481|binding|INFO|Releasing lport cr-lrp-f741e3f2-4708-4091-841d-4a9c05f09b53 from this chassis.
  2022-03-04T12:54:46.798Z|00482|poll_loop|INFO|wakeup due to [POLLIN] on fd 13 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (64% CPU usage)
  2022-03-04T12:54:46.799Z|00483|poll_loop|INFO|wakeup due to [POLLIN] on fd 13 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (64% CPU usage)
  2022-03-04T12:54:46.799Z|00484|poll_loop|INFO|wakeup due to [POLLIN] on fd 13 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (64% CPU usage)
  2022-03-04T12:54:46.799Z|00485|poll_loop|INFO|wakeup due to [POLLIN] on fd 13 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (64% CPU usage)
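
To gauge how often ovn-controller is being woken up, the poll_loop lines in the excerpt above can be counted per socket. The following is a rough sketch, not an official tool: the regular expression is written against the exact line shape shown in this excerpt, the names `WAKEUP_RE` and `summarize_wakeups` are made up for illustration, and other poll_loop message variants are simply ignored.

```python
import re
from collections import Counter

# Pattern derived from the poll_loop lines quoted in this report.
# Lines that do not have this exact shape are skipped.
WAKEUP_RE = re.compile(
    r"^(?P<ts>[^|]+)\|(?P<seq>\d+)\|poll_loop\|INFO\|"
    r"wakeup due to \[(?P<event>\w+)\] on fd (?P<fd>\d+) "
    r"\((?P<peer>[^)]*)\) at (?P<where>\S+) "
    r"\((?P<cpu>\d+)% CPU usage\)$"
)

# Sample lines copied from the log excerpt in the bug description.
SAMPLE = """\
2022-03-04T12:54:43.077Z|00481|binding|INFO|Releasing lport cr-lrp-f741e3f2-4708-4091-841d-4a9c05f09b53 from this chassis.
2022-03-04T12:54:46.798Z|00482|poll_loop|INFO|wakeup due to [POLLIN] on fd 13 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (64% CPU usage)
2022-03-04T12:54:46.799Z|00483|poll_loop|INFO|wakeup due to [POLLIN] on fd 13 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (64% CPU usage)
2022-03-04T12:54:46.799Z|00484|poll_loop|INFO|wakeup due to [POLLIN] on fd 13 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (64% CPU usage)
2022-03-04T12:54:46.799Z|00485|poll_loop|INFO|wakeup due to [POLLIN] on fd 13 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (64% CPU usage)
"""


def summarize_wakeups(lines):
    """Count poll_loop wakeups per peer and track the highest CPU figure seen."""
    per_peer = Counter()
    max_cpu = 0
    for line in lines:
        match = WAKEUP_RE.match(line.strip())
        if match is None:
            continue  # not a poll_loop wakeup line of the expected shape
        per_peer[match.group("peer")] += 1
        max_cpu = max(max_cpu, int(match.group("cpu")))
    return per_peer, max_cpu


per_peer, max_cpu = summarize_wakeups(SAMPLE.splitlines())
for peer, count in per_peer.most_common():
    print(f"{count} wakeups on {peer} (max {max_cpu}% CPU)")
```

Run against the full ovn-controller log, a count dominated by the local db.sock connection would be consistent with the heavy conf.db write load described in symptom 1, though it does not by itself identify the cause.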

  The full ovn-controller log is available here:
  https://private-fileshare.canonical.com/~alitvinov/random/ovn-controller.txt

  A bundle is available here as well:
  https://private-fileshare.canonical.com/~alitvinov/random/bundle-ovn-controller.txt

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1963698/+subscriptions



