[Bug 2017494] [NEW] "nova.exception.PortBindingFailed: Binding failed" for OpenStack Zed in Juju deployment
Thomas Dreibholz
2017494 at bugs.launchpad.net
Mon Apr 24 08:45:02 UTC 2023
Public bug reported:
I made an OpenStack deployment with Juju, as documented in the Charm
deployment guide
(https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/install-juju.html).
The setup consists of 8 nodes. The deployment itself is successful;
Dashboard, Glance, etc. are running. But when trying to instantiate a
VM, the instance creation fails with "nova.exception.PortBindingFailed:
Binding failed for port <UUID>, please check neutron logs for more
information." (in the Nova log, i.e. /var/log/nova/nova-compute.log).
There is the hint to check "neutron logs", but there is actually no
useful information there.
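A minimal sketch for collecting the affected port UUIDs from the Nova log, so that they can be searched for in the Neutron logs. It assumes that the log line contains the text "Binding failed for port <UUID>"; the function name failed_ports is made up for illustration.

```shell
# Print the unique port UUIDs found in "Binding failed for port <UUID>"
# messages. The log text is read from stdin.
# Assumption: the UUID directly follows the word "port".
failed_ports() {
    grep -oE 'Binding failed for port [0-9a-f-]{36}' | awk '{ print $5 }' | sort -u
}

# Example use on a compute node:
#   failed_ports < /var/log/nova/nova-compute.log
```

Each UUID can then be inspected with "openstack port show <UUID> -c binding_vif_type -c binding_host_id"; a vif type of "binding_failed" together with the host name Neutron tried to bind to may help to narrow down where the binding goes wrong.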
So, I checked the configuration first:
neutron.yaml for deployment of "neutron-api" and "ovn-chassis":
ovn-chassis:
  debug: true
  bridge-interface-mappings: >-
    br-simulamet:<MAC_NODE_1>
    br-simulamet:<MAC_NODE_2>
    br-simulamet:...
    ...
  ovn-bridge-mappings: physnet2:br-simulamet
neutron-api:
  verbose: true
  enable-ml2-port-security: true
  neutron-security-groups: true
  enable-vlan-trunking: false
  vlan-ranges: physnet2
  flat-network-providers:
This looks okay. The network interfaces are mapped into the bridge "br-simulamet", and the bridge actually exists on all nodes, e.g.:
root@P52S11:/var/log# ovs-vsctl get open . external_ids:ovn-bridge-mappings
"physnet2:br-simulamet"
The network/subnet configuration in OpenStack should also be okay, e.g.:
network create smil-network4 --external --provider-network-type vlan --provider-physical-network physnet2 --provider-segment 0204 --share
subnet create smil-network4-ipv4 --network smil-network4 --ip-version 4 --description "VLAN0204-SMIL-Network4" --subnet-range 10.193.4.0/24 --no-dhcp --allocation-pool start=10.193.4.200,end=10.193.4.254
So, the network should correctly map to "physnet2", with a VLAN tag
(here: 204).
(For debugging, I also tried to use the network interface as "flat
network" without VLANs. This does not change anything.)
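To cross-check on a node that the physnet used by the network really maps to an existing bridge, a small helper along the following lines could be used. This is only a sketch; the function name bridge_for_physnet is made up for illustration, and it assumes the mapping value has the comma-separated "physnet:bridge" form shown above.

```shell
# Look up the bridge mapped to a physnet in an ovn-bridge-mappings value.
# $1: mapping string as printed by ovs-vsctl, e.g. "physnet2:br-simulamet"
# $2: physnet name to look up
# Prints the bridge name; returns non-zero if the physnet is not mapped.
bridge_for_physnet() {
    # Strip the surrounding double quotes, split the comma-separated pairs,
    # and print the bridge of the first pair whose physnet matches.
    printf '%s\n' "$1" | tr -d '"' | tr ',' '\n' |
        awk -F: -v net="$2" '$1 == net { print $2; found = 1; exit }
                             END { exit !found }'
}

# Example use on a node (requires Open vSwitch to be installed):
#   mappings=$(ovs-vsctl get open . external_ids:ovn-bridge-mappings)
#   bridge=$(bridge_for_physnet "$mappings" physnet2) && ovs-vsctl br-exists "$bridge"
```

"ovs-vsctl br-exists" exits with status 0 if the bridge exists and 2 if it does not, so the example line fails when either the mapping or the bridge is missing.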
The deployment (from "juju status") also looks okay for Neutron and OVN:
...
neutron-api               21.0.0   active  1  neutron-api             zed/stable    546  no  Unit is ready
neutron-api-mysql-router  8.0.32   active  1  mysql-router            8.0/stable    35   no  Unit is ready
neutron-api-plugin-ovn    21.0.0   active  1  neutron-api-plugin-ovn  zed/stable    45   no  Unit is ready
nova-cloud-controller     26.1.0   active  1  nova-cloud-controller   zed/stable    633  no  Unit is ready
...
ovn-central               22.09.0  active  3  ovn-central             22.09/stable  75   no  Unit is ready (leader: ovnsb_db)
ovn-chassis               22.09.1  active  8  ovn-chassis             22.09/stable  109  no  Unit is ready
...
/var/log/ovn/ovn-controller.log does not provide useful information about the port binding failure, even after enabling "debug = true" in /etc/neutron/ovn.ini and restarting the services. Also, increasing the OVN log level did not reveal more information here, i.e.:
ovn-appctl vlog/set dbg
ovn-appctl vlog/disable-rate-limit
Increasing the Open vSwitch log level also did not provide more insight, i.e.:
ovs-appctl vlog/set dbg
ovs-appctl vlog/disable-rate-limit
So, maybe the issue is related to some component around OVN? One strange thing I noticed: There are two "ovsdb-server" processes running, each with a "--log-file" parameter, referring to /var/log/ovn/ovsdb-server-nb.log and /var/log/ovn/ovsdb-server-sb.log, respectively:
root@P52S11:/var/log/openvswitch# ps ax | grep ovn
129278 ? Ssl 4:58 ovn-northd -vconsole:emer -vsyslog:err -vfile:info --ovnnb-db=ssl:172.31.255.116:6641,ssl:172.31.255.115:6641,ssl:172.31.255.114:6641 --ovnsb-db=ssl:172.31.255.116:16642,ssl:172.31.255.115:16642,ssl:172.31.255.114:16642 -c /etc/ovn/cert_host -C /etc/ovn/ovn-central.crt -p /etc/ovn/key_host --no-chdir --log-file=/var/log/ovn/ovn-northd.log --pidfile=/var/run/ovn/ovn-northd.pid --detach
130048 ? Ssl 34:42 ovsdb-server -vconsole:off -vfile:info --log-file=/var/log/ovn/ovsdb-server-nb.log --remote=punix:/var/run/ovn/ovnnb_db.sock --pidfile=/var/run/ovn/ovnnb_db.pid --unixctl=/var/run/ovn/ovnnb_db.ctl --remote=db:OVN_Northbound,NB_Global,connections --private-key=/etc/ovn/key_host --certificate=/etc/ovn/cert_host --ca-cert=/etc/ovn/ovn-central.crt --ssl-protocols=db:OVN_Northbound,SSL,ssl_protocols --ssl-ciphers=db:OVN_Northbound,SSL,ssl_ciphers /var/lib/ovn/ovnnb_db.db
130251 ? Ssl 47:13 ovsdb-server -vconsole:off -vfile:info --log-file=/var/log/ovn/ovsdb-server-sb.log --remote=punix:/var/run/ovn/ovnsb_db.sock --pidfile=/var/run/ovn/ovnsb_db.pid --unixctl=/var/run/ovn/ovnsb_db.ctl --remote=db:OVN_Southbound,SB_Global,connections --private-key=/etc/ovn/key_host --certificate=/etc/ovn/cert_host --ca-cert=/etc/ovn/ovn-central.crt --ssl-protocols=db:OVN_Southbound,SSL,ssl_protocols --ssl-ciphers=db:OVN_Southbound,SSL,ssl_ciphers /var/lib/ovn/ovnsb_db.db
The logs are located inside the containers. Checking them:
/var/snap/lxd/common/lxd/storage-pools/default/containers/juju-3083dc-2-lxd-1/rootfs/var/log/ovn/ovsdb-server-sb.log:
...
2023-04-24T08:30:50.962Z|21781|stream_ssl|WARN|SSL_accept: error:0A000126:SSL routines::unexpected eof while reading
2023-04-24T08:30:50.962Z|21782|jsonrpc|WARN|ssl:127.0.0.1:35594: receive error: Protocol error
2023-04-24T08:30:50.962Z|21783|reconnect|WARN|ssl:127.0.0.1:35594: connection dropped (Protocol error)
2023-04-24T08:35:00.857Z|21784|stream_ssl|WARN|SSL_accept: error:0A000126:SSL routines::unexpected eof while reading
2023-04-24T08:35:00.857Z|21785|jsonrpc|WARN|ssl:127.0.0.1:34324: receive error: Protocol error
2023-04-24T08:35:00.857Z|21786|reconnect|WARN|ssl:127.0.0.1:34324: connection dropped (Protocol error)
/var/snap/lxd/common/lxd/storage-pools/default/containers/juju-3083dc-2-lxd-1/rootfs/var/log/ovn/ovsdb-server-nb.log:
...
2023-04-24T08:30:50.960Z|22445|stream_ssl|WARN|SSL_accept: error:0A000126:SSL routines::unexpected eof while reading
2023-04-24T08:30:50.960Z|22446|jsonrpc|WARN|ssl:127.0.0.1:48028: receive error: Protocol error
2023-04-24T08:30:50.960Z|22447|reconnect|WARN|ssl:127.0.0.1:48028: connection dropped (Protocol error)
2023-04-24T08:35:00.855Z|22448|stream_ssl|WARN|SSL_accept: error:0A000126:SSL routines::unexpected eof while reading
2023-04-24T08:35:00.855Z|22449|jsonrpc|WARN|ssl:127.0.0.1:52444: receive error: Protocol error
2023-04-24T08:35:00.855Z|22450|reconnect|WARN|ssl:127.0.0.1:52444: connection dropped (Protocol error)
The containers belong to the deployment of "ovn-central", so I assume
something is wrong here.
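To get an overview of how often these warnings occur, the log records can be counted per module. The following is a minimal sketch, assuming the "timestamp|sequence|module|level|message" layout seen in the excerpts above; the function name count_by_module is made up for illustration.

```shell
# Count log records of a given level, grouped by module.
# Assumed record layout: timestamp|sequence|module|level|message
# $1: level to count (e.g. WARN); the log text is read from stdin.
count_by_module() {
    awk -F'|' -v lvl="$1" '$4 == lvl { n[$3]++ }
                           END { for (m in n) print m, n[m] }' | sort
}

# Example use inside an ovn-central container:
#   count_by_module WARN < /var/log/ovn/ovsdb-server-sb.log
```

Running it on both ovsdb-server-nb.log and ovsdb-server-sb.log shows whether the stream_ssl, jsonrpc and reconnect warnings always appear together, as they do in the excerpts above.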
The issue appears on all 8 nodes I have set up. So, it is reproducible. I can provide log files, etc. on request.
Could this issue be a bug in an OpenStack package (maybe ovn-central?), a problem with the Juju charms for deploying OpenStack Zed, or some issue with the setup?
** Affects: charm-ovn-central
Importance: Undecided
Status: New
** Affects: neutron (Ubuntu)
Importance: Undecided
Status: New
** Affects: ovn (Ubuntu)
Importance: Undecided
Status: New
** Tags: openstack
** Also affects: charm-ovn-central
Importance: Undecided
Status: New
** Tags added: openstack
https://bugs.launchpad.net/bugs/2017494
To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-ovn-central/+bug/2017494/+subscriptions