[Bug 2017494] Re: "nova.exception.PortBindingFailed: Binding failed" for OpenStack Zed in Juju deployment
Linux
2017494 at bugs.launchpad.net
Wed Jun 14 15:29:56 UTC 2023
@Alex I restarted and rebooted all 3 ovn-central units, but I still see
the same error message. I need to check how to upgrade ovn-chassis from
version 22.03.0 to match ovn-central 22.09.
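The version mismatch mentioned above can be checked mechanically by comparing the release series of the two version strings. This is only a minimal Python sketch: the helper names `release_series` and `same_series` are hypothetical, and it assumes OVN versions follow the `YY.MM[.patch]` pattern seen in this report.

```python
def release_series(version: str) -> tuple:
    """Return the (year, month) series of a version such as '22.03.0'."""
    parts = version.strip().split(".")
    if len(parts) < 2:
        raise ValueError(f"unexpected version string: {version!r}")
    return int(parts[0]), int(parts[1])


def same_series(a: str, b: str) -> bool:
    """True when both versions belong to the same release series."""
    return release_series(a) == release_series(b)


if __name__ == "__main__":
    # Values taken from the comment above.
    chassis, central = "22.03.0", "22.09"
    if not same_series(chassis, central):
        print(f"ovn-chassis {chassis} and ovn-central {central} "
              "are on different release series")
```

Only the series is compared, so patch-level differences such as 22.09.0 versus 22.09.1 are treated as matching.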
● ovn-ovsdb-server-sb.service - Open vSwitch database server for OVN Southbound database
Loaded: loaded (/lib/systemd/system/ovn-ovsdb-server-sb.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2023-06-11 22:29:04 UTC; 2 days ago
Main PID: 35001 (ovsdb-server)
Tasks: 3 (limit: 314572)
Memory: 7.2M
CPU: 5min 32.259s
CGroup: /system.slice/ovn-ovsdb-server-sb.service
└─35001 ovsdb-server -vconsole:off -vfile:info --log-file=/var/log/ovn/ovsdb-server-sb.log --remote=punix:/var/run/ovn/ovnsb_db.sock --pidfile=>
Jun 14 14:56:53 juju-24ab8f-0-lxd-1 ovsdb-server[35001]: ovs|02444|reconnect|INFO|ssl:10.69.212.15:6644: connecting...
Jun 14 14:56:54 juju-24ab8f-0-lxd-1 ovsdb-server[35001]: ovs|02445|reconnect|INFO|ssl:10.69.212.15:6644: connection attempt timed out
Jun 14 14:56:54 juju-24ab8f-0-lxd-1 ovsdb-server[35001]: ovs|02446|reconnect|INFO|ssl:10.69.212.15:6644: waiting 2 seconds before reconnect
Jun 14 14:56:55 juju-24ab8f-0-lxd-1 ovsdb-server[35001]: ovs|02447|raft|INFO|ssl:10.69.212.15:39008: learned server ID 6e42
Jun 14 14:56:55 juju-24ab8f-0-lxd-1 ovsdb-server[35001]: ovs|02448|raft|INFO|ssl:10.69.212.15:39008: learned remote address ssl:10.69.212.15:6644
Jun 14 14:56:56 juju-24ab8f-0-lxd-1 ovsdb-server[35001]: ovs|02449|reconnect|INFO|ssl:10.69.212.15:6644: connecting...
Jun 14 14:56:56 juju-24ab8f-0-lxd-1 ovsdb-server[35001]: ovs|02450|reconnect|INFO|ssl:10.69.212.15:6644: connected
Jun 14 14:57:43 juju-24ab8f-0-lxd-1 ovsdb-server[35001]: ovs|02451|stream_ssl|WARN|SSL_accept: error:0A000126:SSL routines::unexpected eof while reading
Jun 14 14:57:43 juju-24ab8f-0-lxd-1 ovsdb-server[35001]: ovs|02452|jsonrpc|WARN|ssl:127.0.0.1:46464: receive error: Protocol error
Jun 14 14:57:43 juju-24ab8f-0-lxd-1 ovsdb-server[35001]: ovs|02453|reconnect|WARN|ssl:127.0.0.1:46464: connection dropped (Protocol error)
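To see which peers the warnings in the journal excerpt above belong to, the `ovs|<seq>|<module>|<level>|<message>` payload can be split out of each line. The following Python sketch is an illustration only: the function names are hypothetical, and the regular expression assumes the line layout shown in this excerpt.

```python
import re
from collections import Counter

# Payload that follows the journald prefix: ovs|<seq>|<module>|<level>|<message>
PAYLOAD = re.compile(
    r"ovs\|(?P<seq>\d+)\|(?P<module>\w+)\|(?P<level>[A-Z]+)\|(?P<msg>.*)$"
)
# Peer address at the start of a message, e.g. "ssl:127.0.0.1:46464: ..."
PEER = re.compile(r"ssl:([\d.]+):\d+")

# Four lines copied from the excerpt above.
SAMPLE = [
    "Jun 14 14:56:56 juju-24ab8f-0-lxd-1 ovsdb-server[35001]: ovs|02450|reconnect|INFO|ssl:10.69.212.15:6644: connected",
    "Jun 14 14:57:43 juju-24ab8f-0-lxd-1 ovsdb-server[35001]: ovs|02451|stream_ssl|WARN|SSL_accept: error:0A000126:SSL routines::unexpected eof while reading",
    "Jun 14 14:57:43 juju-24ab8f-0-lxd-1 ovsdb-server[35001]: ovs|02452|jsonrpc|WARN|ssl:127.0.0.1:46464: receive error: Protocol error",
    "Jun 14 14:57:43 juju-24ab8f-0-lxd-1 ovsdb-server[35001]: ovs|02453|reconnect|WARN|ssl:127.0.0.1:46464: connection dropped (Protocol error)",
]


def parse_payload(line):
    """Return the payload fields as a dict, or None if the line does not match."""
    m = PAYLOAD.search(line)
    return m.groupdict() if m else None


def warnings_by_peer(lines):
    """Count WARN records per peer IP; records without a peer count as 'unknown'."""
    counts = Counter()
    for line in lines:
        rec = parse_payload(line)
        if rec is None or rec["level"] != "WARN":
            continue
        peer = PEER.match(rec["msg"])
        counts[peer.group(1) if peer else "unknown"] += 1
    return counts


if __name__ == "__main__":
    for peer, count in warnings_by_peer(SAMPLE).items():
        print(peer, count)
```

In this excerpt the warnings that name a peer all involve 127.0.0.1, while the remote peer 10.69.212.15 ends up connected.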
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to neutron in Ubuntu.
https://bugs.launchpad.net/bugs/2017494
Title:
"nova.exception.PortBindingFailed: Binding failed" for OpenStack Zed
in Juju deployment
Status in charm-ovn-central:
New
Status in neutron package in Ubuntu:
New
Status in ovn package in Ubuntu:
New
Bug description:
I made an OpenStack deployment with Juju, as documented in the Charm
deployment guide
(https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/install-juju.html).
The setup consists of 8 nodes. The deployment itself is successful:
Dashboard, Glance, etc. are running. But when trying to instantiate a
VM, this fails with "nova.exception.PortBindingFailed: Binding
failed <UUID>, please check neutron logs for more information." (in
the Nova log, i.e. /var/log/nova/nova-compute.log).
There is the hint to check "neutron logs", but there is actually no
useful information there.
So, I checked the configuration first:
neutron.yaml for deployment of "neutron-api" and "ovn-chassis":
ovn-chassis:
  debug: true
  bridge-interface-mappings: >-
    br-simulamet:<MAC_NODE_1>
    br-simulamet:<MAC_NODE_2>
    br-simulamet:...
    ...
  ovn-bridge-mappings: physnet2:br-simulamet
neutron-api:
  verbose: true
  enable-ml2-port-security: true
  neutron-security-groups: true
  enable-vlan-trunking: false
  vlan-ranges: physnet2
  flat-network-providers:
This looks okay. The network interfaces are mapped into the bridge "br-simulamet", and it actually exists on all nodes, e.g.:
root@P52S11:/var/log# ovs-vsctl get open . external_ids:ovn-bridge-mappings
"physnet2:br-simulamet"
The network/subnet configuration in OpenStack should also be okay, e.g.:
network create smil-network4 --external --provider-network-type vlan --provider-physical-network physnet2 --provider-segment 0204 --share
subnet create smil-network4-ipv4 --network smil-network4 --ip-version 4 --description "VLAN0204-SMIL-Network4" --subnet-range 10.193.4.0/24 --no-dhcp --allocation-pool start=10.193.4.200,end=10.193.4.254
So, the network should correctly map to "physnet2", with a VLAN tag
(here: 204).
(For debugging, I also tried to use the network interface as "flat
network" without VLANs. This does not change anything.)
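The consistency check described above (physical network present in the bridge mappings, VLAN segment valid) can be sketched in Python as follows. This is a hedged illustration, not part of any OpenStack or OVN tooling: the function names are hypothetical, and it assumes the `external_ids:ovn-bridge-mappings` value is a comma-separated list of `physnet:bridge` pairs.

```python
def parse_bridge_mappings(value):
    """Parse 'physnet:bridge[,physnet:bridge...]' into a dict."""
    mappings = {}
    # ovs-vsctl prints the value in double quotes, so strip them first.
    for item in value.strip().strip('"').split(","):
        item = item.strip()
        if not item:
            continue
        physnet, sep, bridge = item.partition(":")
        if not sep or not physnet or not bridge:
            raise ValueError(f"malformed mapping: {item!r}")
        mappings[physnet] = bridge
    return mappings


def check_network(mappings, physnet, segment):
    """Return a list of problems found for a VLAN provider network."""
    problems = []
    if physnet not in mappings:
        problems.append(f"{physnet} has no bridge mapping")
    vlan_id = int(segment)  # '0204' is read as decimal 204
    if not 1 <= vlan_id <= 4094:
        problems.append(f"VLAN ID {vlan_id} is outside 1-4094")
    return problems


if __name__ == "__main__":
    # Values taken from the ovs-vsctl output and the network create command.
    mappings = parse_bridge_mappings('"physnet2:br-simulamet"')
    print(check_network(mappings, "physnet2", "0204") or "no problems found")
```

For the values in this report the check finds no problems, which matches the observation that the configuration itself looks okay.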
The deployment (from "juju status") also looks okay for Neutron and OVN:
...
neutron-api               21.0.0   active  1  neutron-api             zed/stable    546  no  Unit is ready
neutron-api-mysql-router  8.0.32   active  1  mysql-router            8.0/stable     35  no  Unit is ready
neutron-api-plugin-ovn    21.0.0   active  1  neutron-api-plugin-ovn  zed/stable     45  no  Unit is ready
nova-cloud-controller     26.1.0   active  1  nova-cloud-controller   zed/stable    633  no  Unit is ready
...
ovn-central               22.09.0  active  3  ovn-central             22.09/stable   75  no  Unit is ready (leader: ovnsb_db)
ovn-chassis               22.09.1  active  8  ovn-chassis             22.09/stable  109  no  Unit is ready
...
/var/log/ovn/ovn-controller.log does not provide useful information about the port binding failure, even after enabling "debug = true" in /etc/neutron/ovn.ini and restarting the services. Also, increasing the OVN log level did not reveal more information here, i.e.:
ovn-appctl vlog/set dbg
ovn-appctl vlog/disable-rate-limit
Increasing the Open vSwitch log level also did not reveal more insight, i.e.:
ovs-appctl vlog/set dbg
ovs-appctl vlog/disable-rate-limit
So, maybe the issue is related to some component around OVN? One strange thing I noticed: there are two "ovsdb-server" processes running, each with a "--log-file" parameter, referring to /var/log/ovn/ovsdb-server-nb.log and /var/log/ovn/ovsdb-server-sb.log respectively (ovn-northd itself logs to /var/log/ovn/ovn-northd.log):
root@P52S11:/var/log/openvswitch# ps ax | grep ovn
129278 ? Ssl 4:58 ovn-northd -vconsole:emer -vsyslog:err -vfile:info --ovnnb-db=ssl:172.31.255.116:6641,ssl:172.31.255.115:6641,ssl:172.31.255.114:6641 --ovnsb-db=ssl:172.31.255.116:16642,ssl:172.31.255.115:16642,ssl:172.31.255.114:16642 -c /etc/ovn/cert_host -C /etc/ovn/ovn-central.crt -p /etc/ovn/key_host --no-chdir --log-file=/var/log/ovn/ovn-northd.log --pidfile=/var/run/ovn/ovn-northd.pid --detach
130048 ? Ssl 34:42 ovsdb-server -vconsole:off -vfile:info --log-file=/var/log/ovn/ovsdb-server-nb.log --remote=punix:/var/run/ovn/ovnnb_db.sock --pidfile=/var/run/ovn/ovnnb_db.pid --unixctl=/var/run/ovn/ovnnb_db.ctl --remote=db:OVN_Northbound,NB_Global,connections --private-key=/etc/ovn/key_host --certificate=/etc/ovn/cert_host --ca-cert=/etc/ovn/ovn-central.crt --ssl-protocols=db:OVN_Northbound,SSL,ssl_protocols --ssl-ciphers=db:OVN_Northbound,SSL,ssl_ciphers /var/lib/ovn/ovnnb_db.db
130251 ? Ssl 47:13 ovsdb-server -vconsole:off -vfile:info --log-file=/var/log/ovn/ovsdb-server-sb.log --remote=punix:/var/run/ovn/ovnsb_db.sock --pidfile=/var/run/ovn/ovnsb_db.pid --unixctl=/var/run/ovn/ovnsb_db.ctl --remote=db:OVN_Southbound,SB_Global,connections --private-key=/etc/ovn/key_host --certificate=/etc/ovn/cert_host --ca-cert=/etc/ovn/ovn-central.crt --ssl-protocols=db:OVN_Southbound,SSL,ssl_protocols --ssl-ciphers=db:OVN_Southbound,SSL,ssl_ciphers /var/lib/ovn/ovnsb_db.db
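To see at a glance which process writes to which log file, the `--log-file=` argument can be pulled out of each `ps ax` line. The Python sketch below is illustrative only: the function name is hypothetical, the sample lines are shortened copies of the output above, and the column layout (PID, TTY, STAT, TIME, COMMAND) is assumed from that output.

```python
import re

LOG_FILE = re.compile(r"--log-file=(\S+)")

# Shortened copies of the three ps lines above (most arguments omitted).
PS_SAMPLE = [
    "129278 ? Ssl 4:58 ovn-northd -vconsole:emer -vsyslog:err -vfile:info --no-chdir --log-file=/var/log/ovn/ovn-northd.log --pidfile=/var/run/ovn/ovn-northd.pid --detach",
    "130048 ? Ssl 34:42 ovsdb-server -vconsole:off -vfile:info --log-file=/var/log/ovn/ovsdb-server-nb.log --remote=punix:/var/run/ovn/ovnnb_db.sock",
    "130251 ? Ssl 47:13 ovsdb-server -vconsole:off -vfile:info --log-file=/var/log/ovn/ovsdb-server-sb.log --remote=punix:/var/run/ovn/ovnsb_db.sock",
]


def log_files_by_process(ps_lines):
    """Map (pid, command) to the --log-file path found on each ps line."""
    result = {}
    for line in ps_lines:
        fields = line.split(None, 5)
        # Skip header lines and anything without a numeric PID and a command.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        m = LOG_FILE.search(line)
        if m:
            result[(int(fields[0]), fields[4])] = m.group(1)
    return result


if __name__ == "__main__":
    for (pid, command), path in log_files_by_process(PS_SAMPLE).items():
        print(pid, command, path)
```

On the lines quoted here, the two ovsdb-server processes log to ovsdb-server-nb.log and ovsdb-server-sb.log, and ovn-northd.log belongs to the ovn-northd process.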
The logs are inside the containers. Checking them:
/var/snap/lxd/common/lxd/storage-pools/default/containers/juju-3083dc-2-lxd-1/rootfs/var/log/ovn/ovsdb-server-sb.log:
...
2023-04-24T08:30:50.962Z|21781|stream_ssl|WARN|SSL_accept: error:0A000126:SSL routines::unexpected eof while reading
2023-04-24T08:30:50.962Z|21782|jsonrpc|WARN|ssl:127.0.0.1:35594: receive error: Protocol error
2023-04-24T08:30:50.962Z|21783|reconnect|WARN|ssl:127.0.0.1:35594: connection dropped (Protocol error)
2023-04-24T08:35:00.857Z|21784|stream_ssl|WARN|SSL_accept: error:0A000126:SSL routines::unexpected eof while reading
2023-04-24T08:35:00.857Z|21785|jsonrpc|WARN|ssl:127.0.0.1:34324: receive error: Protocol error
2023-04-24T08:35:00.857Z|21786|reconnect|WARN|ssl:127.0.0.1:34324: connection dropped (Protocol error)
/var/snap/lxd/common/lxd/storage-pools/default/containers/juju-3083dc-2-lxd-1/rootfs/var/log/ovn/ovsdb-server-nb.log:
...
2023-04-24T08:30:50.960Z|22445|stream_ssl|WARN|SSL_accept: error:0A000126:SSL routines::unexpected eof while reading
2023-04-24T08:30:50.960Z|22446|jsonrpc|WARN|ssl:127.0.0.1:48028: receive error: Protocol error
2023-04-24T08:30:50.960Z|22447|reconnect|WARN|ssl:127.0.0.1:48028: connection dropped (Protocol error)
2023-04-24T08:35:00.855Z|22448|stream_ssl|WARN|SSL_accept: error:0A000126:SSL routines::unexpected eof while reading
2023-04-24T08:35:00.855Z|22449|jsonrpc|WARN|ssl:127.0.0.1:52444: receive error: Protocol error
2023-04-24T08:35:00.855Z|22450|reconnect|WARN|ssl:127.0.0.1:52444: connection dropped (Protocol error)
The containers belong to the deployment of "ovn-central", so I assume
something is wrong here.
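One way to look for a pattern in these warnings is to measure the time between successive "SSL_accept" bursts. The Python sketch below assumes the `timestamp|seq|module|level|message` layout of the log files quoted above; the function names are hypothetical, and the sample is copied from the ovsdb-server-sb.log excerpt.

```python
from datetime import datetime

# Lines copied from the ovsdb-server-sb.log excerpt above.
SB_SAMPLE = [
    "...",
    "2023-04-24T08:30:50.962Z|21781|stream_ssl|WARN|SSL_accept: error:0A000126:SSL routines::unexpected eof while reading",
    "2023-04-24T08:30:50.962Z|21782|jsonrpc|WARN|ssl:127.0.0.1:35594: receive error: Protocol error",
    "2023-04-24T08:30:50.962Z|21783|reconnect|WARN|ssl:127.0.0.1:35594: connection dropped (Protocol error)",
    "2023-04-24T08:35:00.857Z|21784|stream_ssl|WARN|SSL_accept: error:0A000126:SSL routines::unexpected eof while reading",
    "2023-04-24T08:35:00.857Z|21785|jsonrpc|WARN|ssl:127.0.0.1:34324: receive error: Protocol error",
    "2023-04-24T08:35:00.857Z|21786|reconnect|WARN|ssl:127.0.0.1:34324: connection dropped (Protocol error)",
]


def parse_record(line):
    """Split 'timestamp|seq|module|level|message' into a dict, or return None."""
    parts = line.strip().split("|", 4)
    if len(parts) != 5:
        return None
    ts, seq, module, level, msg = parts
    try:
        when = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")
    except ValueError:
        return None
    return {"time": when, "seq": int(seq), "module": module,
            "level": level, "msg": msg}


def burst_times(lines, module="stream_ssl"):
    """Return timestamps of WARN records from one module, in file order."""
    records = (parse_record(line) for line in lines)
    return [r["time"] for r in records
            if r and r["level"] == "WARN" and r["module"] == module]


def gaps_seconds(times):
    """Return the gaps in seconds between consecutive timestamps."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    print(gaps_seconds(burst_times(SB_SAMPLE)))
```

In the excerpts above, both databases show the same gap of roughly 250 seconds between bursts, always from 127.0.0.1. Two samples are too few to draw a conclusion, but a regular interval would be worth checking against any periodic local client.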
The issue appears on all 8 nodes I have set up, so it is reproducible. I can provide log files, etc. on request.
Could this issue be a bug in an OpenStack package (maybe ovn-central?), a problem with the Juju charms used to deploy OpenStack Zed, or some issue with the setup?