[Bug 1988457] Re: [SRU] ovsdbapp can time out on raft leadership change

Andreas Hasenack 1988457 at bugs.launchpad.net
Thu Aug 22 21:17:10 UTC 2024


Hello Terry, or anyone else affected,

Accepted python-ovsdbapp into jammy-proposed. The package will build now
and be available at
https://launchpad.net/ubuntu/+source/python-ovsdbapp/1.15.1-0ubuntu2.1
in a few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will help us get this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug
mentioning the version of the package you tested and what testing has
been performed on the package, and change the tag from
verification-needed-jammy to verification-done-jammy. If it does not fix
the bug for you, please add a comment stating that, and change the tag
to verification-failed-jammy. In either case, without details of your
testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

** Changed in: python-ovsdbapp (Ubuntu Jammy)
       Status: New => Fix Committed

** Tags added: verification-needed verification-needed-jammy

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1988457

Title:
  [SRU] ovsdbapp can time out on raft leadership change

Status in Ubuntu Cloud Archive:
  Fix Released
Status in Ubuntu Cloud Archive yoga series:
  New
Status in ovsdbapp:
  Fix Released
Status in python-ovsdbapp package in Ubuntu:
  Fix Released
Status in python-ovsdbapp source package in Jammy:
  Fix Committed

Bug description:
  When raft leadership changes, any leader-only connections will be
  disconnected and will need to reconnect to the new leader. When this
  happens, the IDL will return a txn status of TRY_AGAIN. The current
  code does an exponential backoff with sleep() because of an issue
  where TRY_AGAIN results can be returned 1000s of times a second. This
  sleep also prevents reconnecting quickly, because idl.run() is not
  called often enough while sleeping, and can lead to timeouts.
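
  To illustrate the behaviour described above, here is a minimal sketch.
  It is not the ovsdbapp code and not the actual patch: FakeIdl, its
  commit() method, and both retry functions are hypothetical names made
  up for this example. It only shows why doubling the sleep after each
  TRY_AGAIN leaves few chances to call run() before a timeout, compared
  with a short fixed wait.

```python
import time


class FakeIdl:
    """Hypothetical stand-in for an OVSDB IDL (not ovs.db.idl.Idl).

    The simulated reconnect to the new leader completes only after
    run() has been called `runs_until_reconnected` times.
    """

    def __init__(self, runs_until_reconnected):
        self.runs_left = runs_until_reconnected
        self.run_calls = 0

    def run(self):
        # Each call makes one step of progress on the reconnect.
        self.run_calls += 1
        if self.runs_left > 0:
            self.runs_left -= 1

    def commit(self):
        # Until the reconnect is done, every attempt returns TRY_AGAIN.
        return "SUCCESS" if self.runs_left == 0 else "TRY_AGAIN"


def commit_with_sleep_backoff(idl, timeout, base_delay=0.01,
                              clock=time.monotonic, sleep=time.sleep):
    """Retry with a sleep that doubles after every TRY_AGAIN."""
    deadline = clock() + timeout
    delay = base_delay
    while clock() < deadline:
        idl.run()
        if idl.commit() == "SUCCESS":
            return True
        sleep(delay)  # run() is not called while we sleep here
        delay *= 2
    return False


def commit_with_short_poll(idl, timeout, poll_interval=0.001,
                           clock=time.monotonic, sleep=time.sleep):
    """Retry with a short fixed wait so run() keeps being called."""
    deadline = clock() + timeout
    while clock() < deadline:
        idl.run()
        if idl.commit() == "SUCCESS":
            return True
        sleep(poll_interval)  # still bounds how often we retry
    return False
```

  With a reconnect that needs 20 calls to run() and a 0.5 second
  timeout, the backoff variant gives up after only a handful of calls,
  while the short-poll variant completes. The numbers are arbitrary and
  chosen only to make the difference visible.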

  --------------------------------------------------------------------------------
  SRU TEMPLATE:

  [Impact]

  Please see the original bug description. What I can add is that, in
  production, ovsdbapp transactions would fail after a timeout and
  ovsdbapp would then end up in a retry sequence such that the
  transactions would not get retried, and vm tap devices would not get
  deleted from ovs when a vm was deleted. The result was a build-up of
  "stale" tap devices on br-int (visible as "No such device" entries in
  ovs-vsctl show).

  [Test Plan]

  * Deploy OpenStack Jammy (Yoga) with ml2-ovn
  * Spawn several vms
  * Trigger many ovn-central db leadership switches by restarting ovn-central units in rotation, leaving enough time between each restart for a new leader to be elected.
  * Delete the vms and create many more while leaders are being re-elected.
  * First check that /var/log/nova/nova-compute.log does not contain the "OVSDB transaction returned TRY_AGAIN" message over and over, then check that ovs-vsctl show does not list any "stale" ports with errors like the following:

      Port tapa5d45fc6-02
          Interface tapa5d45fc6-02
              error: "could not open network device tapa5d45fc6-02 (No such device)"
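
  The last check in the test plan could be scripted roughly as follows.
  This is only a sketch: the function names are made up for this
  example, and the pattern assumes the error text looks exactly like the
  sample above. The log path and the TRY_AGAIN message are the ones
  given in the test plan.

```python
import re
import subprocess

# Matches the error line shown in the sample ovs-vsctl show output.
STALE_PORT_RE = re.compile(
    r'error: "could not open network device (\S+) \(No such device\)"'
)

TRY_AGAIN_MSG = "OVSDB transaction returned TRY_AGAIN"


def find_stale_ports(ovs_vsctl_show_output):
    """Return the device names reported with a 'No such device' error."""
    return STALE_PORT_RE.findall(ovs_vsctl_show_output)


def count_try_again(log_text):
    """Count how many times the TRY_AGAIN message appears in a log."""
    return log_text.count(TRY_AGAIN_MSG)


if __name__ == "__main__":
    # Run on a compute node; both commands/paths come from the test plan.
    show = subprocess.run(
        ["ovs-vsctl", "show"], capture_output=True, text=True, check=True
    ).stdout
    with open("/var/log/nova/nova-compute.log", errors="replace") as f:
        log_text = f.read()
    print("stale ports:", find_stale_ports(show))
    print("TRY_AGAIN messages:", count_try_again(log_text))
```

  An empty list of stale ports, and a TRY_AGAIN count that does not keep
  growing between runs, would be consistent with the fix working.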

  
  [Regression Potential]
  This patch is not expected to introduce any regressions.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1988457/+subscriptions



