[Bug 1899369] Re: ovn-controller: Disable ofctrl probe by default

Ivan Kolemanov 1899369 at bugs.launchpad.net
Thu Dec 30 09:29:46 UTC 2021


Hi, is the fix "released" for Ubuntu Cloud Archive victoria?

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ovn in Ubuntu.
https://bugs.launchpad.net/bugs/1899369

Title:
  ovn-controller: Disable ofctrl probe by default

Status in Ubuntu Cloud Archive:
  Fix Committed
Status in Ubuntu Cloud Archive ussuri series:
  Fix Released
Status in Ubuntu Cloud Archive victoria series:
  Fix Committed
Status in ovn package in Ubuntu:
  Fix Released
Status in ovn source package in Focal:
  Fix Released
Status in ovn source package in Groovy:
  Fix Released
Status in ovn source package in Hirsute:
  Fix Released

Bug description:
  [Impact]
  Service/host restart or upgrade of the ovn-host package may render a host participating in an OVN network unusable as the ovn-controller process fails to complete programming of the local Open vSwitch switch flows.

  [Test Case]
  The issue was discovered when migrating a 3-node OpenStack cloud with 1000 instances deployed in our test lab. A test case could be to repeat that setup.

  [Regression Potential]
  None. The change of behavior was introduced upstream in [0] and later reversed in [1]. Keeping an idle probe for a unix socket type connection is clearly unnecessary.

  [Original Bug Report]
  A change [0] prior to the release of OVN v20.03.0 altered the behavior so that the inactivity probe for the ofctrl connection defaults to 5 seconds. Because this connection is normally a unix socket, the previous default was to have no inactivity probe at all.

  On a busy system an inactivity probe of 5 seconds is not enough for the
  OVN Controller to complete programming of the switch.

  The change of behavior was corrected in [1] and I think it would be
  beneficial if Ubuntu backported this fix to the OVN package rather
  than having charms and/or end users work around the issue by manually
  configuring the timeout through the
  `external-ids:ovn-openflow-probe-interval` key in the Open_vSwitch
  table.
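
  For reference, the manual workaround mentioned above could be scripted
  roughly as follows. This is only a sketch: the helper names
  build_probe_command and apply_probe_interval are hypothetical, and the
  claim that a value of 0 disables the probe is an assumption based on my
  reading of the ovn-controller documentation, so please check it against
  your OVN version before relying on it.

```python
import subprocess
from typing import List

# Key named in the bug report; it lives in the Open_vSwitch table.
PROBE_KEY = "external-ids:ovn-openflow-probe-interval"


def build_probe_command(seconds: int) -> List[str]:
    """Build the ovs-vsctl argument list that sets the ofctrl probe interval.

    Assumption: 0 is understood to disable the probe entirely.
    """
    if seconds < 0:
        raise ValueError("probe interval must be non-negative")
    return ["ovs-vsctl", "set", "Open_vSwitch", ".", f"{PROBE_KEY}={seconds}"]


def apply_probe_interval(seconds: int) -> None:
    """Run the command on the local host (needs ovs-vsctl and privileges)."""
    subprocess.run(build_probe_command(seconds), check=True)
```

  Calling apply_probe_interval(0) on an affected hypervisor would issue
  the ovs-vsctl command; build_probe_command alone only returns the
  argument list, which is handy for reviewing or logging what would run.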

  Symptoms of this problem are that an OVN controller is either unable
  to do initial programming of a switch for a host with many ports and
  flows, or that updates are lost on a functional system. The following
  will be printed in the log:

  2020-10-11T18:56:09.355Z|30186|rconn|ERR|unix:/var/run/openvswitch/br-int.mgmt: no response to inactivity probe after 5 seconds, disconnecting
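
  To check whether a host is hitting this, the log can be scanned for
  the message above. The following is a minimal sketch, assuming the
  message keeps the format of the sample line in this report; the
  function name parse_probe_timeout is hypothetical and the pattern has
  only been matched against that one sample.

```python
import re
from typing import Optional, Tuple

# Pattern modelled on the single sample log line in this bug report.
PROBE_TIMEOUT_RE = re.compile(
    r"\|rconn\|ERR\|(?P<target>\S+): no response to inactivity probe "
    r"after (?P<seconds>\d+) seconds, disconnecting"
)


def parse_probe_timeout(line: str) -> Optional[Tuple[str, int]]:
    """Return (connection target, probe seconds) if the line reports a
    probe timeout, otherwise None."""
    match = PROBE_TIMEOUT_RE.search(line)
    if match is None:
        return None
    return match.group("target"), int(match.group("seconds"))


sample = (
    "2020-10-11T18:56:09.355Z|30186|rconn|ERR|"
    "unix:/var/run/openvswitch/br-int.mgmt: no response to inactivity "
    "probe after 5 seconds, disconnecting"
)
result = parse_probe_timeout(sample)
```

  For the sample line, result holds the unix socket path of the
  integration bridge and the 5 second interval; lines that do not report
  a probe timeout yield None.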

  0: https://github.com/ovn-org/ovn/commit/c99069c8934c9ea55d310a8b6d48fb66aa477589
  1: https://github.com/ovn-org/ovn/commit/b8af8549396e62d6523be18e104352e334825783

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1899369/+subscriptions



