[Bug 1899369] Re: ovn-controller: Disable ofctrl probe by default
Corey Bryant
1899369 at bugs.launchpad.net
Thu Jan 7 13:44:24 UTC 2021
This bug was fixed in the package ovn - 20.03.1-0ubuntu1.1~cloud0
---------------
ovn (20.03.1-0ubuntu1.1~cloud0) bionic-ussuri; urgency=medium
.
* New upstream release for the Ubuntu Cloud Archive.
.
ovn (20.03.1-0ubuntu1.1) focal; urgency=medium
.
* d/p/ovn-controller-ofctrl-probe-interval.patch: Cherry pick
fix to disable ofctrl probe by default (LP: #1899369).
.
ovn (20.03.1-0ubuntu1) focal; urgency=medium
.
* New upstream point release (LP: #1897248).
** Changed in: cloud-archive/ussuri
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ovn in Ubuntu.
https://bugs.launchpad.net/bugs/1899369
Title:
ovn-controller: Disable ofctrl probe by default
Status in Ubuntu Cloud Archive:
Fix Committed
Status in Ubuntu Cloud Archive ussuri series:
Fix Released
Status in Ubuntu Cloud Archive victoria series:
Fix Committed
Status in ovn package in Ubuntu:
Fix Released
Status in ovn source package in Focal:
Fix Released
Status in ovn source package in Groovy:
Fix Released
Status in ovn source package in Hirsute:
Fix Released
Bug description:
[Impact]
Service/host restart or upgrade of the ovn-host package may render a host participating in an OVN network unusable, as the ovn-controller process fails to complete programming of the local Open vSwitch switch flows.
[Test Case]
The issue was discovered when migrating a 3-node OpenStack cloud with 1000 instances deployed in our test lab. A test case could be to repeat that setup.
[Regression Potential]
None. The change of behavior was introduced upstream in [0] and later reversed in [1]. Keeping an idle probe for a unix socket type connection is clearly unnecessary.
[Original Bug Report]
A change [0] made prior to the release of OVN v20.03.0 altered the default behavior so that the inactivity probe for the ofctrl connection defaults to 5 seconds. Since this connection is normally a unix socket, the previous default was to have no inactivity probe at all.
On a busy system an inactivity probe of 5 seconds is not enough for the
OVN Controller to complete programming of the switch.
The change of behavior was corrected in [1] and I think it would be
beneficial if Ubuntu backported this fix to the OVN package rather
than having charms and/or end users work around the issue by manually
configuring the timeout through the
`external-ids:ovn-openflow-probe-interval` key in the Open_vSwitch
table.
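For readers who need the manual workaround before the fixed package is installed, the following is a minimal sketch of how the key could be set. It only builds the `ovs-vsctl` argument list and does not run it. The function name `probe_interval_command` is hypothetical, and the choice of 30 seconds is an illustrative assumption rather than a recommended value. My understanding of the ovn-controller documentation is that a value of 0 disables the probe entirely, but please verify that against the documentation for your OVN version.

```python
import shlex
from typing import List

# Key named in the bug report; it lives in the external-ids column
# of the single row in the Open_vSwitch table.
PROBE_KEY = "external-ids:ovn-openflow-probe-interval"


def probe_interval_command(seconds: int) -> List[str]:
    """Build (but do not run) an ovs-vsctl command that sets the probe interval.

    Hypothetical helper for illustration only.
    """
    if seconds < 0:
        raise ValueError("probe interval must not be negative")
    # "." selects the only record in the Open_vSwitch table.
    return ["ovs-vsctl", "set", "Open_vSwitch", ".", f"{PROBE_KEY}={seconds}"]


if __name__ == "__main__":
    # 30 is an arbitrary example value, not taken from the bug report.
    print(shlex.join(probe_interval_command(30)))
```

The resulting list can be passed to `subprocess.run` on the affected host once the value has been reviewed.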
Symptoms of this problem are that an OVN controller is either unable to
do initial programming of a switch for a host with many ports and
flows, or that updates are lost on a functional system. The following
will be printed in the log:
2020-10-11T18:56:09.355Z|30186|rconn|ERR|unix:/var/run/openvswitch/br-int.mgmt: no response to inactivity probe after 5 seconds, disconnecting
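To check whether a host is affected, the log can be scanned for this message. The sketch below is one possible way to do that, assuming the message always has the pipe-separated shape of the single sample line above. The names `ProbeTimeout` and `parse_probe_timeout` are hypothetical and not part of OVN or Open vSwitch.

```python
import re
from typing import NamedTuple, Optional


class ProbeTimeout(NamedTuple):
    timestamp: str  # e.g. "2020-10-11T18:56:09.355Z"
    target: str     # connection name, e.g. the br-int.mgmt unix socket
    seconds: int    # probe interval reported in the message


# Pattern derived from the one sample line in the bug report;
# other log layouts are not covered.
_PATTERN = re.compile(
    r"^(?P<timestamp>[^|]+)\|\d+\|rconn\|ERR\|(?P<target>.+?): "
    r"no response to inactivity probe after (?P<seconds>\d+) seconds, "
    r"disconnecting"
)


def parse_probe_timeout(line: str) -> Optional[ProbeTimeout]:
    """Return the parsed fields if the line is a probe timeout, else None."""
    match = _PATTERN.match(line.strip())
    if match is None:
        return None
    return ProbeTimeout(
        timestamp=match.group("timestamp"),
        target=match.group("target"),
        seconds=int(match.group("seconds")),
    )
```

Applying `parse_probe_timeout` to each line of the ovn-controller log and counting the non-`None` results gives a rough indication of how often the connection is being dropped.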
0: https://github.com/ovn-org/ovn/commit/c99069c8934c9ea55d310a8b6d48fb66aa477589
1: https://github.com/ovn-org/ovn/commit/b8af8549396e62d6523be18e104352e334825783
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1899369/+subscriptions