[Bug 1420572] Re: race between neutron-ovs-cleanup and nova-compute

Edward Hope-Morley edward.hope-morley at canonical.com
Sun Feb 22 17:14:01 UTC 2015


I have tested both the attached Icehouse and Juno patches and can
confirm that they behave as expected, i.e.:

I installed nova-compute and neutron-plugin-openvswitch-agent (which
installs neutron-ovs-cleanup).

/var/log/upstart/nova-compute.log then shows, as expected:

libvirt-bin start/running, process 1409
wait-for-state stop/waiting
neutron-ovs-cleanup stop/waiting
wait-for-state stop/waiting

And if I add a 10 second delay to /usr/bin/neutron-ovs-cleanup I get as
expected:

(time sudo service neutron-ovs-cleanup restart &); time sudo service nova-compute restart
nova-compute stop/waiting
neutron-ovs-cleanup stop/waiting
neutron-ovs-cleanup start/running

real	0m10.460s
user	0m0.010s
sys	0m0.015s
nova-compute start/running, process 3026

real	0m10.468s
user	0m0.010s
sys	0m0.014s

So nova-compute will now always wait for neutron-ovs-cleanup to
complete. I also tested that if neutron-ovs-cleanup is not installed,
the wait is skipped and nova-compute starts normally.
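
To make the behaviour I tested concrete, here is a minimal Python sketch
of the same ordering rule. It is not the patch itself (the patches do
this in the upstart job); the function name, the callables and the
timeout are all hypothetical and only illustrate "wait for cleanup if it
is installed, otherwise start straight away".

```python
import time


def wait_for_cleanup(is_installed, is_finished, timeout=60.0, poll=0.5,
                     sleep=time.sleep, clock=time.monotonic):
    """Block until the cleanup job has finished.

    Returns True if the job is absent or has finished, False on timeout.
    Every parameter here is a hypothetical hook for this sketch; the
    timeout in particular is an assumption, not part of the tested patch.
    """
    if not is_installed():
        # Cleanup job is not installed: nothing to wait for.
        return True
    deadline = clock() + timeout
    while True:
        if is_finished():
            return True
        if clock() >= deadline:
            return False
        sleep(poll)
```

A caller would start nova-compute only once this returns True, which
matches the roughly 10 second delay seen in the timings above when
neutron-ovs-cleanup was slowed down.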

-- 
https://bugs.launchpad.net/bugs/1420572

Title:
  race between neutron-ovs-cleanup and nova-compute

Status in nova package in Ubuntu:
  In Progress
Status in nova source package in Trusty:
  In Progress
Status in nova source package in Utopic:
  In Progress
Status in nova source package in Vivid:
  In Progress

Bug description:
  [Impact]

   * We run neutron-ovs-cleanup at startup if neutron is installed. If
     nova-compute does not wait for it to complete, it will try to use
     veth/bridge devices that may be in the process of being deleted.

  [Test Case]

   * Create a neutron (ovs) network and boot an instance with this
     network as its --nic

   * Check that creation was successful and network is functional. Also make
     a note of the corresponding veth and bridge devices (ip a).

   * Reboot the system, check that the expected veth and bridge devices
     are still there and that nova-compute is happy, e.g. try sshing to
     your instance. Also check /var/log/upstart/nova-compute.log to see
     whether the service waited for ovs-cleanup to finish.
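
  The before/after device comparison in the test case can be automated
  with something like the following Python sketch. It assumes you saved
  the output of "ip a" before and after the reboot; the function names
  are hypothetical, and the "qbr" bridge prefix is an assumption on top
  of the qvb/qvo prefixes named in this report.

```python
import re

# Matches the first line of each interface block in `ip a` output,
# e.g. "5: qvo1234@qvb1234: <BROADCAST,...> mtu 1500 ..."
_IFACE_RE = re.compile(r"^\d+:\s+([^:@\s]+)")


def device_names(ip_a_output):
    """Return the set of interface names found in `ip a` output."""
    names = set()
    for line in ip_a_output.splitlines():
        match = _IFACE_RE.match(line)
        if match:
            names.add(match.group(1))
    return names


def missing_after_reboot(before, after, prefixes=("qvb", "qvo", "qbr")):
    """List devices with the given prefixes that were present before
    the reboot but are absent afterwards (sorted by name)."""
    gone = device_names(before) - device_names(after)
    return sorted(name for name in gone if name.startswith(prefixes))
```

  An empty list from missing_after_reboot() means every noted device
  survived the reboot.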

  [Regression Potential]

   * None

  ---- ---- ---- ----

  There is a race when both neutron-ovs-cleanup and nova-compute try to
  operate on the qvb*** and qvo*** devices. Below is a scenario I
  recently encountered:

  1. nova-compute was started and began creating the veth pair for the
  VM instances running on the host -
  https://github.com/openstack/nova/blob/stable/icehouse/nova/network/linux_net.py#L1298

  2. neutron-ovs-cleanup was kicked off and deleted all the ports.

  3. When nova-compute tried to set the MTU at
  https://github.com/openstack/nova/blob/stable/icehouse/nova/network/linux_net.py#L1280
  it reported Stderr: u'Cannot find device "qvo***"\n', because the
  device that had just been created was deleted again by
  neutron-ovs-cleanup.

  As they both operate on the same resources, there needs to be a way
  to synchronize the operations the two processes perform on those
  resources.
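
  One generic way to get that kind of synchronization between two
  processes is an advisory file lock. The Python sketch below is only an
  illustration: neither nova nor neutron is claimed to work this way,
  and the helper name and lock path are hypothetical.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def device_lock(path, blocking=True):
    """Hold an exclusive advisory lock on `path` for the with-block.

    `path` is a hypothetical lock file that both processes agree on.
    With blocking=False, BlockingIOError is raised if the lock is held
    through another open of the same file.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        flags = fcntl.LOCK_EX if blocking else fcntl.LOCK_EX | fcntl.LOCK_NB
        fcntl.flock(fd, flags)
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

  With this, the cleanup would delete ports inside "with
  device_lock(path):" and the compute side would create the veth pair
  and set its MTU inside the same kind of block, so the two sequences
  could not interleave.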



