[Bug 1460164] Re: upgrade of openvswitch-switch can sometimes break neutron-plugin-openvswitch-agent

James Page james.page at ubuntu.com
Fri Dec 18 11:07:46 UTC 2015


On a fresh Icehouse install, I see the following when ovs is restarted:

2015-12-18 11:05:26.855 6876 DEBUG neutron.agent.linux.async_process [-] Halting async process [['ovsdb-client', 'monitor', 'Interface', 'name,ofport', '--format=json']]. stop /usr/lib/python2.7/dist-packages/neutron/agent/linux/async_process.py:90
2015-12-18 11:05:26.857 6876 CRITICAL neutron [-] Trying to re-send() an already-triggered event.

The neutron-plugin-openvswitch-agent then terminates and gets restarted
by upstart, triggering a full sync of ovs state:


2015-12-18 11:05:27.229 11075 INFO neutron.common.config [-] Logging enabled!
2015-12-18 11:05:27.230 11075 DEBUG neutron.plugins.openvswitch.agent.ovs_neutron_agent [-] ******************************************************************************** log_opt_values /usr/lib/python2.7/dist-packages/oslo/config/cfg.py:1928
2015-12-18 11:05:27.230 11075 DEBUG neutron.plugins.openvswitch.agent.ovs_neutron_agent [-] Configuration options gathered from: log_opt_values /usr/lib/python2.7/dist-packages/oslo/config/cfg.py:1929
2015-12-18 11:05:27.230 11075 DEBUG neutron.plugins.openvswitch.agent.ovs_neutron_agent [-] command line args: ['--config-file=/etc/neutron/neutron.conf', '--config-file=/etc/neutron/plugins/ml2/ml2_conf.ini', '--log-file=/var/log/neutron/openvswitch-agent.log'] log_opt_values /usr/lib/python2.7/dist-packages/oslo/config/cfg.py:1930
2015-12-18 11:05:27.230 11075 DEBUG neutron.plugins.openvswitch.agent.ovs_neutron_agent [-] config files: ['/etc/neutron/neutron.conf', '/etc/neutron/plugins/ml2/ml2_conf.ini'] log_opt_values /usr/lib/python2.7/dist-packages/oslo/config/cfg.py:1931

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to neutron in Ubuntu.
https://bugs.launchpad.net/bugs/1460164

Title:
  upgrade of openvswitch-switch can sometimes break neutron-plugin-openvswitch-agent

Status in neutron package in Ubuntu:
  Triaged

Bug description:
  On 2015-05-28, our Landscape auto-upgraded packages on two of our
  OpenStack clouds.  On both clouds, but only on some compute nodes, the
  upgrade of openvswitch-switch and corresponding downtime of
  ovs-vswitchd appears to have triggered some sort of race condition
  within neutron-plugin-openvswitch-agent leaving it in a broken state;
  any new instances come up with non-functional network but pre-existing
  instances appear unaffected.  Restarting n-p-ovs-agent on the affected
  compute nodes is sufficient to work around the problem.

  The packages Landscape upgraded (from /var/log/apt/history.log):

  Start-Date: 2015-05-28  14:23:07
  Upgrade: nova-compute-libvirt:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1), libsystemd-login0:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), nova-compute-kvm:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1), systemd-services:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), isc-dhcp-common:amd64 (4.2.4-7ubuntu12.1, 4.2.4-7ubuntu12.2), nova-common:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1), python-nova:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1), libsystemd-daemon0:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), grub-common:amd64 (2.02~beta2-9ubuntu1.1, 2.02~beta2-9ubuntu1.2), libpam-systemd:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), udev:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), grub2-common:amd64 (2.02~beta2-9ubuntu1.1, 2.02~beta2-9ubuntu1.2), openvswitch-switch:amd64 (2.0.2-0ubuntu0.14.04.1, 2.0.2-0ubuntu0.14.04.2), libudev1:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), isc-dhcp-client:amd64 (4.2.4-7ubuntu12.1, 4.2.4-7ubuntu12.2), python-eventlet:amd64 (0.13.0-1ubuntu2, 0.13.0-1ubuntu2.1), python-novaclient:amd64 (2.17.0-0ubuntu1.1, 2.17.0-0ubuntu1.2), grub-pc-bin:amd64 (2.02~beta2-9ubuntu1.1, 2.02~beta2-9ubuntu1.2), grub-pc:amd64 (2.02~beta2-9ubuntu1.1, 2.02~beta2-9ubuntu1.2), nova-compute:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1), openvswitch-common:amd64 (2.0.2-0ubuntu0.14.04.1, 2.0.2-0ubuntu0.14.04.2)
  End-Date: 2015-05-28  14:24:47
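
When checking other nodes for the same upgrade, the Upgrade: line from history.log can be split into per-package version pairs. The following is only a rough sketch: the function name parse_upgrade_line and the regular expression are illustrative, not part of apt, and it assumes every entry has the "name:arch (old, new)" shape seen in the excerpt above.

```python
import re

# Matches one "name:arch (old, new)" entry, as seen in the history.log
# excerpt in this report. Entries in another shape are silently skipped.
PKG_RE = re.compile(
    r"(?P<name>[^\s:,]+):(?P<arch>[^\s()]+) "
    r"\((?P<old>[^,()]+), (?P<new>[^()]+)\)"
)


def parse_upgrade_line(line):
    """Return {package: (old_version, new_version)} for one Upgrade: line."""
    line = line.strip()
    prefix = "Upgrade:"
    if not line.startswith(prefix):
        raise ValueError("not an Upgrade: line")
    body = line[len(prefix):]
    return {
        m.group("name"): (m.group("old"), m.group("new"))
        for m in PKG_RE.finditer(body)
    }


# Shortened sample taken from the excerpt above.
sample = (
    "Upgrade: python-eventlet:amd64 (0.13.0-1ubuntu2, 0.13.0-1ubuntu2.1), "
    "openvswitch-switch:amd64 (2.0.2-0ubuntu0.14.04.1, 2.0.2-0ubuntu0.14.04.2)"
)
upgrades = parse_upgrade_line(sample)
if "openvswitch-switch" in upgrades:
    old, new = upgrades["openvswitch-switch"]
    print("openvswitch-switch upgraded: %s -> %s" % (old, new))
```

A node whose history.log shows openvswitch-switch in an Upgrade: line around the time of the failure is a candidate for the agent restart workaround described above.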

  From /var/log/neutron/openvswitch-agent.log:

  2015-05-28 14:24:18.336 47866 ERROR neutron.agent.linux.ovsdb_monitor
  [-] Error received from ovsdb monitor: ovsdb-client:
  unix:/var/run/openvswitch/db.sock: receive failed (End of file)
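
To find which compute nodes hit this, one option is to scan each node's openvswitch-agent.log for the two messages quoted in this report. This is a minimal sketch, not anything shipped with neutron: find_signatures and the line regex are hypothetical helpers, and it assumes each log record sits on a single line in the actual file (the excerpt above is wrapped by the mail client).

```python
import re

# Message fragments copied from the log excerpts in this bug report.
SIGNATURES = (
    "Error received from ovsdb monitor",
    "Trying to re-send() an already-triggered event",
)

# "<date> <time> <pid> <LEVEL> ..." prefix, as seen in the excerpts.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) (\d+) (\w+) "
)


def find_signatures(lines):
    """Return (timestamp, pid, level, signature) for each matching line."""
    hits = []
    for line in lines:
        line = line.strip()
        m = LINE_RE.match(line)
        if not m:
            continue
        for sig in SIGNATURES:
            if sig in line:
                hits.append((m.group(1), int(m.group(2)), m.group(3), sig))
    return hits


# Sample lines rebuilt from the excerpts in this report.
sample = [
    "2015-05-28 14:24:18.336 47866 ERROR neutron.agent.linux.ovsdb_monitor "
    "[-] Error received from ovsdb monitor: ovsdb-client: "
    "unix:/var/run/openvswitch/db.sock: receive failed (End of file)",
    "2015-12-18 11:05:26.857 6876 CRITICAL neutron [-] Trying to re-send() "
    "an already-triggered event.",
]
for hit in find_signatures(sample):
    print(hit)
```

In practice the lines would come from iterating over the opened log file instead of the sample list. A hit does not prove the agent is wedged; it only flags nodes worth checking.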

  Looking at a stuck instance, all the right tunnels, bridges and
  whatnot appear to be there:

  root@vector:~# ip l l | grep c-3b
  460002: qbr7ed8b59c-3b: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default 
  460003: qvo7ed8b59c-3b: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master ovs-system state UP mode DEFAULT group default qlen 1000
  460004: qvb7ed8b59c-3b: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master qbr7ed8b59c-3b state UP mode DEFAULT group default qlen 1000
  460005: tap7ed8b59c-3b: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master qbr7ed8b59c-3b state UNKNOWN mode DEFAULT group default qlen 500
  root@vector:~# ovs-vsctl list-ports br-int | grep c-3b
  qvo7ed8b59c-3b
  root@vector:~#

  But I can't ping the unit from within the qrouter-${id} namespace on
  the neutron gateway.  If I tcpdump the {q,t}*c-3b interfaces, I don't
  see any traffic.



