[Bug 1906280] Re: [SRU] Add support for disabling memlockall() calls in ovs-vswitchd
Corey Bryant
1906280 at bugs.launchpad.net
Tue Dec 15 20:14:59 UTC 2020
** Summary changed:
- Charm stuck waiting for ovsdb 'no key "ovn-remote" in Open_vSwitch record'
+ [SRU] Add support for disabling memlockall() calls in ovs-vswitchd
** Description changed:
+ [Impact]
+
+ Original bug title: Charm stuck waiting for ovsdb 'no key "ovn-remote"
+ in Open_vSwitch record'
+
As seen during this Focal Ussuri test run: https://solutions.qa.canonical.com/testruns/testRun/5f7ad510-f57e-40ce-beb7-5f39800fa5f0
Crashdump here: https://oil-jenkins.canonical.com/artifacts/5f7ad510-f57e-40ce-beb7-5f39800fa5f0/generated/generated/openstack/juju-crashdump-openstack-2020-11-28-03.40.36.tar.gz
Full history of occurrences can be found here:
https://solutions.qa.canonical.com/bugs/bugs/bug/1906280
Octavia's ovn-chassis units are stuck waiting:
octavia/0 blocked idle 1/lxd/8 10.244.8.170 9876/tcp Awaiting leader to create required resources
hacluster-octavia/1 active idle 10.244.8.170 Unit is ready and clustered
logrotated/63 active idle 10.244.8.170 Unit is ready.
octavia-ovn-chassis/1 waiting executing 10.244.8.170 'ovsdb' incomplete
public-policy-routing/45 active idle 10.244.8.170 Unit is ready
When the db is reporting healthy:
ovn-central/0* active idle 1/lxd/9 10.246.64.225 6641/tcp,6642/tcp Unit is ready (leader: ovnnb_db, ovnsb_db)
logrotated/19 active idle 10.246.64.225 Unit is ready.
ovn-central/1 active idle 3/lxd/9 10.246.64.250 6641/tcp,6642/tcp Unit is ready (northd: active)
logrotated/27 active idle 10.246.64.250 Unit is ready.
ovn-central/2 active idle 5/lxd/9 10.246.65.21 6641/tcp,6642/tcp Unit is ready
logrotated/52 active idle 10.246.65.21 Unit is ready.
Warning in the juju unit logs indicates that the charm is blocking on a
missing key in the ovsdb:
2020-11-27 23:36:57 INFO juju-log ovsdb:195: Invoking reactive handler: hooks/relations/ovsdb-subordinate/provides.py:97:joined:ovsdb-subordinate
2020-11-27 23:36:57 DEBUG jujuc server.go:211 running hook tool "relation-get"
2020-11-27 23:36:57 WARNING ovsdb-relation-changed ovs-vsctl: no key "ovn-remote" in Open_vSwitch record "." column external_ids
2020-11-27 23:36:57 DEBUG jujuc server.go:211 running hook tool "juju-log"
2020-11-27 23:36:57 INFO juju-log ovsdb:195: Invoking reactive handler: hooks/relations/ovsdb/requires.py:34:joined:ovsdb
** Description changed:
[Impact]
+
+
+ https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1906280/comments/16
+ https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1906280/comments/19
** Description changed:
[Impact]
+ Recent changes to systemd rlimit are resulting in memory exhaustion with
+ ovs-vswitchd's use of mlockall(). mlockall() can be disabled via
+ /etc/defaults/openvswitch-vswitch, however there is currently a bug in
+ the shipped ovs-vswitchd systemd unit file and default environment
+ variable file. The package will be fixed in this SRU. Additionally the
+ neutron-openvswitch charm will be updated to enable disabling of
+ mlockall() use in ovs-vswitchd.
+ More details on the above summary can be found in the following comments:
https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1906280/comments/16
https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1906280/comments/19
- Original bug title: Charm stuck waiting for ovsdb 'no key "ovn-remote"
- in Open_vSwitch record'
+ Original bug title:
+
+ Charm stuck waiting for ovsdb 'no key "ovn-remote" in Open_vSwitch
+ record'
+
+ Original bug details:
** Description changed:
[Impact]
Recent changes to systemd rlimit are resulting in memory exhaustion with
ovs-vswitchd's use of mlockall(). mlockall() can be disabled via
/etc/defaults/openvswitch-vswitch, however there is currently a bug in
- the shipped ovs-vswitchd systemd unit file and default environment
- variable file. The package will be fixed in this SRU. Additionally the
- neutron-openvswitch charm will be updated to enable disabling of
- mlockall() use in ovs-vswitchd.
+ the shipped ovs-vswitchd systemd unit file that prevents it. The package
+ will be fixed in this SRU. Additionally the neutron-openvswitch charm
+ will be updated to enable disabling of mlockall() use in ovs-vswitchd.
More details on the above summary can be found in the following comments:
https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1906280/comments/16
https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1906280/comments/19
** Description changed:
[Impact]
Recent changes to systemd rlimit are resulting in memory exhaustion with
ovs-vswitchd's use of mlockall(). mlockall() can be disabled via
/etc/defaults/openvswitch-vswitch, however there is currently a bug in
the shipped ovs-vswitchd systemd unit file that prevents it. The package
will be fixed in this SRU. Additionally the neutron-openvswitch charm
- will be updated to enable disabling of mlockall() use in ovs-vswitchd.
+ will be updated to enable disabling of mlockall() use in ovs-vswitchd
+ via a config option.
More details on the above summary can be found in the following comments:
https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1906280/comments/16
https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1906280/comments/19
** Description changed:
[Impact]
Recent changes to systemd rlimit are resulting in memory exhaustion with
ovs-vswitchd's use of mlockall(). mlockall() can be disabled via
/etc/defaults/openvswitch-vswitch, however there is currently a bug in
the shipped ovs-vswitchd systemd unit file that prevents it. The package
will be fixed in this SRU. Additionally the neutron-openvswitch charm
will be updated to enable disabling of mlockall() use in ovs-vswitchd
via a config option.
More details on the above summary can be found in the following comments:
https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1906280/comments/16
https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1906280/comments/19
+
+ ==== Original bug details ===
+ ==============================
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to openvswitch in Ubuntu.
https://bugs.launchpad.net/bugs/1906280
Title:
[SRU] Add support for disabling memlockall() calls in ovs-vswitchd
Status in OpenStack neutron-openvswitch charm:
In Progress
Status in charm-ovn-chassis:
In Progress
Status in Ubuntu Cloud Archive:
Invalid
Status in Ubuntu Cloud Archive queens series:
Triaged
Status in Ubuntu Cloud Archive stein series:
Triaged
Status in Ubuntu Cloud Archive train series:
Triaged
Status in Ubuntu Cloud Archive ussuri series:
Triaged
Status in openvswitch package in Ubuntu:
Triaged
Status in openvswitch source package in Bionic:
Triaged
Status in openvswitch source package in Focal:
Triaged
Status in openvswitch source package in Groovy:
Triaged
Status in openvswitch source package in Hirsute:
Triaged
Bug description:
[Impact]
Recent changes to systemd's rlimit defaults are resulting in memory
exhaustion with ovs-vswitchd's use of mlockall(). mlockall() can be
disabled via /etc/default/openvswitch-switch; however, there is
currently a bug in the shipped ovs-vswitchd systemd unit file that
prevents this. The package will be fixed in this SRU. Additionally,
the neutron-openvswitch charm will be updated to allow disabling
mlockall() in ovs-vswitchd via a config option.
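The rlimit interaction described above can be inspected directly; a minimal sketch using only the Python standard library (values vary by host and by the systemd defaults in effect):

```python
import resource

# RLIMIT_MEMLOCK caps how much memory a process may pin with mlockall().
# ovs-vswitchd issues mlockall(MCL_CURRENT | MCL_FUTURE) at startup, so a
# finite limit here interacts badly with a daemon whose RSS can grow.
soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)

def fmt(limit):
    # RLIM_INFINITY means mlockall() is effectively unbounded.
    return "unlimited" if limit == resource.RLIM_INFINITY else f"{limit} bytes"

print("RLIMIT_MEMLOCK soft:", fmt(soft))
print("RLIMIT_MEMLOCK hard:", fmt(hard))
```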
More details on the above summary can be found in the following comments:
https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1906280/comments/16
https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1906280/comments/19
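As a configuration sketch of the manual workaround, assuming the OVS_CTL_OPTS mechanism documented for the openvswitch-switch packaging, and that the fixed unit file actually reads the environment file:

```shell
# Sketch: disable mlockall() for ovs-vswitchd. --no-mlockall is the
# ovs-ctl flag that skips the mlockall() call at daemon startup.
echo "OVS_CTL_OPTS='--no-mlockall'" | sudo tee -a /etc/default/openvswitch-switch
sudo systemctl restart openvswitch-switch

# Confirm: with mlockall() skipped, little or no memory stays locked.
grep VmLck "/proc/$(pidof ovs-vswitchd)/status"
```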
==== Original bug details ====
Original bug title:
Charm stuck waiting for ovsdb 'no key "ovn-remote" in Open_vSwitch
record'
Original bug details:
As seen during this Focal Ussuri test run: https://solutions.qa.canonical.com/testruns/testRun/5f7ad510-f57e-40ce-beb7-5f39800fa5f0
Crashdump here: https://oil-jenkins.canonical.com/artifacts/5f7ad510-f57e-40ce-beb7-5f39800fa5f0/generated/generated/openstack/juju-crashdump-openstack-2020-11-28-03.40.36.tar.gz
Full history of occurrences can be found here:
https://solutions.qa.canonical.com/bugs/bugs/bug/1906280
Octavia's ovn-chassis units are stuck waiting:
octavia/0 blocked idle 1/lxd/8 10.244.8.170 9876/tcp Awaiting leader to create required resources
hacluster-octavia/1 active idle 10.244.8.170 Unit is ready and clustered
logrotated/63 active idle 10.244.8.170 Unit is ready.
octavia-ovn-chassis/1 waiting executing 10.244.8.170 'ovsdb' incomplete
public-policy-routing/45 active idle 10.244.8.170 Unit is ready
When the db is reporting healthy:
ovn-central/0* active idle 1/lxd/9 10.246.64.225 6641/tcp,6642/tcp Unit is ready (leader: ovnnb_db, ovnsb_db)
logrotated/19 active idle 10.246.64.225 Unit is ready.
ovn-central/1 active idle 3/lxd/9 10.246.64.250 6641/tcp,6642/tcp Unit is ready (northd: active)
logrotated/27 active idle 10.246.64.250 Unit is ready.
ovn-central/2 active idle 5/lxd/9 10.246.65.21 6641/tcp,6642/tcp Unit is ready
logrotated/52 active idle 10.246.65.21 Unit is ready.
A warning in the juju unit logs indicates that the charm is blocked
on a missing key in the ovsdb:
2020-11-27 23:36:57 INFO juju-log ovsdb:195: Invoking reactive handler: hooks/relations/ovsdb-subordinate/provides.py:97:joined:ovsdb-subordinate
2020-11-27 23:36:57 DEBUG jujuc server.go:211 running hook tool "relation-get"
2020-11-27 23:36:57 WARNING ovsdb-relation-changed ovs-vsctl: no key "ovn-remote" in Open_vSwitch record "." column external_ids
2020-11-27 23:36:57 DEBUG jujuc server.go:211 running hook tool "juju-log"
2020-11-27 23:36:57 INFO juju-log ovsdb:195: Invoking reactive handler: hooks/relations/ovsdb/requires.py:34:joined:ovsdb
==============================
[Test Case]
The easiest way to test this is to deploy OpenStack with the
neutron-openvswitch charm, using the new charm updates. Once deployed:
1. Edit /usr/share/openvswitch/scripts/ovs-ctl, adding an echo that
   prints what MLOCKALL is set to.
2. Toggle the charm config option:
   juju config neutron-openvswitch disable-mlockall=true
   juju config neutron-openvswitch disable-mlockall=false
3. Check journalctl -xe for the echo output, which should correspond
   to the mlockall setting.
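The test plan above can be sketched as a loop; an operational sketch only, with a hypothetical unit name (neutron-openvswitch/0) that should be adjusted to the actual deployment:

```shell
# Toggle the charm option both ways and look for the echo added to
# ovs-ctl in the service's journal on the unit's machine.
for value in true false; do
    juju config neutron-openvswitch disable-mlockall="$value"
    # Wait for the config-changed hook to restart openvswitch-switch,
    # then inspect the recent journal entries for the MLOCKALL echo.
    juju ssh neutron-openvswitch/0 -- \
        journalctl -u openvswitch-switch -n 50 | grep -i mlockall
done
```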
[Regression Potential]
This could break users who have come to depend on the incorrect
EnvironmentFile setting and environment variable in the systemd unit
file for ovs-vswitchd. If so, they must already be running with
modified systemd unit files, so the risk is likely moot.
To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1906280/+subscriptions
More information about the Ubuntu-openstack-bugs
mailing list