[Bug 1999605] Re: ovsdb: schema conversion for clustered db blocks preventing processing of raft election and inactivity probes
Trent Lloyd
1999605 at bugs.launchpad.net
Thu Dec 15 06:50:17 UTC 2022
For the workaround mentioned in the description, the option to raise
from 4s to 30s or more should be ovsdb-server-election-timer. It was
also observed that changing this option on the fly, once a server is
already in this state, may not take effect, and you may need to
stop/start the stuck server(s).
** Tags added: sts
** Changed in: openvswitch (Ubuntu)
Status: New => Confirmed
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to openvswitch in Ubuntu.
https://bugs.launchpad.net/bugs/1999605
Title:
ovsdb: schema conversion for clustered db blocks preventing processing
of raft election and inactivity probes
Status in openvswitch package in Ubuntu:
Confirmed
Bug description:
When performing an online schema conversion for a clustered DB the
`ovsdb-client` connects to the current leader of the cluster and
requests it to convert the DB to a new schema.
The main thread of the leader ovsdb-server will then parse the new
schema and copy the entire database into a new in-memory copy using
the new schema. For a moderately sized database, let's say 650MB on
disk, this process can take more than 24 seconds on a modern,
adequately performant system.
While this is happening, the ovsdb-server process will not process any
raft election events or inactivity probes, so by the time the
conversion is done and the now-former leader wants to write the
converted database to the cluster, its connection to the cluster is
dead.
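The timing problem can be illustrated with a very small model. This is
only a sketch: the function name is hypothetical, and it assumes the
followers hear nothing at all from the leader for the full duration of
the conversion, which is a simplification of what raft actually does.

```python
def leader_survives(conversion_s, election_timer_s):
    """Return True if the leader is heard from before followers time out.

    Simplified model: while the main thread is busy converting, no
    heartbeats are sent, so followers see a silence of conversion_s
    seconds. If that silence reaches the election timer, they start
    an election and the old leader is left behind.
    """
    return conversion_s < election_timer_s


# Figures from this bug: roughly 24 s conversion, 4 s vs 30 s timer.
print(leader_survives(24, 4))
print(leader_survives(24, 30))
```

With the 4s timer the model reports that leadership is lost; with a 30s
timer the conversion fits inside one election window.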
The former leader will keep repeating this process indefinitely, until
the client requesting the conversion disconnects. No message is passed
to the client.
Meanwhile the other nodes in the cluster have moved on with a new
leader.
A workaround for this scenario would be to increase the election timer
to a value large enough that the conversion can succeed within an
election window.
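As a rough sketch of applying that workaround by hand, the helper below
lists the intermediate timer values needed to get from a current value
to a target. It assumes that each change may at most double the current
value, so a large increase has to be done in several steps; please
verify that against your ovsdb-server version. The helper name, the
database name and the control socket path are assumptions for
illustration, not taken from this bug.

```python
def election_timer_steps(current_ms, target_ms):
    """List the values to set, in order, to raise the timer to target_ms.

    Assumes each change may at most double the current value.
    Returns an empty list if the timer is already at or above target_ms.
    """
    if current_ms <= 0 or target_ms <= 0:
        raise ValueError("timer values must be positive")
    steps = []
    while current_ms < target_ms:
        current_ms = min(current_ms * 2, target_ms)
        steps.append(current_ms)
    return steps


DB = "OVN_Northbound"              # assumed database name
CTL = "/var/run/ovn/ovnnb_db.ctl"  # assumed control socket path

# Print one command per step; nothing is executed here.
for ms in election_timer_steps(4000, 30000):
    print(f"ovs-appctl -t {CTL} cluster/change-election-timer {DB} {ms}")
```

The printed commands are meant to be run on the current leader, one at
a time, checking the result with cluster/status between steps.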
I don't view this as a permanent solution though, as it would be
unfair to leave the end user guessing the correct election timer
in order for their upgrades to succeed.
Maybe we need to hand off conversion to a thread and make the main
loop only process raft requests until it is done, similar to the
recent addition of preparing snapshot JSON in a separate thread [0].
0: https://github.com/openvswitch/ovs/commit/3cd2cbd684e023682d04dd11d2640b53e4725790
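To make the idea concrete, here is a minimal sketch of the hand-off
pattern in Python. It is not how ovsdb-server is implemented (that is
C, and the real conversion and raft handling are far more involved);
convert() and main_loop() are hypothetical stand-ins that only show the
main loop continuing to service raft events while a worker thread does
the expensive copy.

```python
import queue
import threading


def convert(rows, result):
    """Stand-in for the expensive schema conversion (hypothetical)."""
    result.put([dict(row, converted=True) for row in rows])


def main_loop(rows, raft_events):
    """Serve raft events while a worker thread converts the database."""
    result = queue.Queue()
    worker = threading.Thread(target=convert, args=(rows, result))
    worker.start()

    handled = []
    pending = list(raft_events)
    # Keep servicing raft traffic until the conversion has finished
    # and no events are left to handle.
    while worker.is_alive() or pending:
        if pending:
            handled.append(pending.pop(0))
        else:
            # Nothing to handle right now: wait briefly for the worker.
            worker.join(timeout=0.01)
    worker.join()
    return result.get(), handled


converted, handled = main_loop(
    [{"name": "br-int"}], ["heartbeat", "vote_request"]
)
print(handled)
print(converted)
```

The point of the design is that the main loop never blocks for the
whole conversion, so heartbeats and vote requests are still answered
within the election timer.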