[Bug 1420572] Re: race between neutron-ovs-cleanup and nova-compute
Edward Hope-Morley
edward.hope-morley at canonical.com
Sun Feb 22 17:14:01 UTC 2015
I have tested both the attached Icehouse and Juno patches and can
confirm that they behave as expected, i.e.:
Installed nova-compute + neutron-plugin-openvswitch-agent (which
installs neutron-ovs-cleanup)
In /var/log/upstart/nova-compute.log I get as expected:
libvirt-bin start/running, process 1409
wait-for-state stop/waiting
neutron-ovs-cleanup stop/waiting
wait-for-state stop/waiting
And if I add a 10 second delay to /usr/bin/neutron-ovs-cleanup I get as
expected:
(time sudo service neutron-ovs-cleanup restart &); time sudo service nova-compute restart
nova-compute stop/waiting
neutron-ovs-cleanup stop/waiting
neutron-ovs-cleanup start/running
real 0m10.460s
user 0m0.010s
sys 0m0.015s
nova-compute start/running, process 3026
real 0m10.468s
user 0m0.010s
sys 0m0.014s
So, nova-compute will now always wait for ovs-cleanup to complete. I also
tested that if ovs-cleanup is not installed it is ignored and
nova-compute starts.
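
For illustration only, the behaviour described above (wait for cleanup to
finish, but skip the wait if cleanup is not installed) can be sketched in
Python. This is not the attached patch, which changes the upstart job; the
function name wait_for_cleanup and the two callables passed to it are
hypothetical stand-ins for "is the cleanup job installed?" and "has the
cleanup job finished?".

```python
import time
from typing import Callable


def wait_for_cleanup(
    is_installed: Callable[[], bool],
    is_finished: Callable[[], bool],
    timeout: float = 60.0,
    poll_interval: float = 0.5,
) -> str:
    """Block until the cleanup job has finished (hypothetical helper).

    Returns "skipped" if the cleanup job is not installed and "done" once
    it reports finished. Raises TimeoutError if it never finishes.
    """
    if not is_installed():
        # Nothing to wait for, so the dependent service may start at once.
        return "skipped"

    deadline = time.monotonic() + timeout
    while not is_finished():
        if time.monotonic() >= deadline:
            raise TimeoutError("cleanup did not finish in time")
        time.sleep(poll_interval)
    return "done"
```

The timeout is an addition of this sketch so that the loop cannot block
forever; whether the real job should give up at some point is a separate
decision.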
--
You received this bug notification because you are a member of Ubuntu
Sponsors Team, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1420572
Title:
race between neutron-ovs-cleanup and nova-compute
Status in nova package in Ubuntu:
In Progress
Status in nova source package in Trusty:
In Progress
Status in nova source package in Utopic:
In Progress
Status in nova source package in Vivid:
In Progress
Bug description:
[Impact]
* We run neutron-ovs-cleanup at startup if neutron is installed. If
nova-compute does not wait for it to complete, it will try to use
veth/bridge devices that may be in the process of being deleted.
[Test Case]
* Create neutron (ovs) network and boot an instance with this network
as --nic
* Check that creation was successful and network is functional. Also make
a note of the corresponding veth and bridge devices (ip a).
* Reboot the system, check that the expected veth and bridge devices are
still there and that nova-compute is working, e.g. try to ssh into your
instance. Also check /var/log/upstart/nova-compute.log to see whether the
service waited for ovs-cleanup to finish.
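
The log check in the last step could be scripted roughly as follows. This
is a minimal sketch that assumes the log contains a line of the form
"neutron-ovs-cleanup stop/waiting", as in the sample output quoted in the
comment on this bug; the helper name waited_for_cleanup is hypothetical.

```python
def waited_for_cleanup(log_text: str, job: str = "neutron-ovs-cleanup") -> bool:
    """Return True if the log records the job reaching stop/waiting.

    Assumes upstart-style status lines such as
    "neutron-ovs-cleanup stop/waiting", one per line.
    """
    marker = f"{job} stop/waiting"
    return any(line.strip() == marker for line in log_text.splitlines())


def check_log_file(path: str = "/var/log/upstart/nova-compute.log") -> bool:
    """Read the nova-compute upstart log and apply the check above."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return waited_for_cleanup(handle.read())
```

A matching line only shows that the cleanup job was seen in the stopped
state. It does not by itself prove the ordering, so the device check in the
same step is still needed.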
[Regression Potential]
* None
---- ---- ---- ----
There is a race when both neutron-ovs-cleanup and nova-compute are trying
to operate on the qvb*** and qvo*** devices. Below is a scenario
I recently encountered:
1. nova-compute was started and was creating the veth_pair for VM
instances running on the host -
https://github.com/openstack/nova/blob/stable/icehouse/nova/network/linux_net.py#L1298
2. neutron-ovs-cleanup was kicked off and deleted all the ports.
3. When nova-compute tried to set the MTU at
https://github.com/openstack/nova/blob/stable/icehouse/nova/network/linux_net.py#L1280
, Stderr: u'Cannot find device "qvo***"\n' was reported, because the
device that had just been created was deleted again by neutron-ovs-cleanup.
As they both operate on the same resources, there needs to be a way to
synchronize the operations the two processes perform on those resources.
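
One way to picture the required ordering is the following toy simulation.
It is only a sketch: the set named devices stands in for the kernel's
network devices, the two inner functions stand in for neutron-ovs-cleanup
and nova-compute, and all names here (simulate, qvo-demo, qvo-stale) are
made up for the example. The compute side blocks on an event until cleanup
has finished, so the device it creates cannot be deleted underneath it.

```python
import threading


def simulate(device: str = "qvo-demo") -> set:
    """Run a toy cleanup/compute pair with cleanup forced to finish first."""
    devices = {"qvo-stale"}            # leftover device from before reboot
    cleanup_done = threading.Event()   # set once cleanup has completed
    errors = []

    def ovs_cleanup() -> None:
        devices.clear()                # delete all existing ports
        cleanup_done.set()             # signal that compute may proceed

    def compute_plug() -> None:
        cleanup_done.wait(timeout=5)   # do not touch devices before cleanup
        devices.add(device)            # "create the veth pair"
        if device not in devices:      # "set the MTU" would fail here
            errors.append(f'Cannot find device "{device}"')

    compute = threading.Thread(target=compute_plug)
    cleanup = threading.Thread(target=ovs_cleanup)
    compute.start()                    # compute starts first but must wait
    cleanup.start()
    compute.join()
    cleanup.join()

    if errors:
        raise RuntimeError(errors[0])
    return devices
```

Without the wait on cleanup_done, the clear() call could run between the
add and the membership check, which is the failure described in step 3.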
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/1420572/+subscriptions