[Bug 1471022] Re: [SRU] race between nova-compute and neutron-ovs-cleanup

Chris J Arges 1471022 at bugs.launchpad.net
Wed Jul 8 16:51:17 UTC 2015


Hello Edward, or anyone else affected,

Accepted nova into vivid-proposed. The package will build now and be
available at
https://launchpad.net/ubuntu/+source/nova/1:2015.1.0-0ubuntu1.1 in a few
hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, and change the tag
from verification-needed to verification-done. If it does not fix the
bug for you, please add a comment stating that, and change the tag to
verification-failed.  In either case, details of your testing will help
us make a better decision.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance!

** Changed in: nova (Ubuntu Vivid)
       Status: In Progress => Fix Committed

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to nova in Ubuntu.
https://bugs.launchpad.net/bugs/1471022

Title:
  [SRU] race between nova-compute and neutron-ovs-cleanup

Status in nova package in Ubuntu:
  Fix Released
Status in nova source package in Trusty:
  Fix Committed
Status in nova source package in Utopic:
  Fix Committed
Status in nova source package in Vivid:
  Fix Committed

Bug description:
  [Impact]

  This issue appears to be a consequence of
  https://bugs.launchpad.net/ubuntu/+source/nova/+bug/1420572 where we
  added a 'wait-for-state running' to the nova-compute upstart so as to
  ensure that neutron-ovs-cleanup has finished before nova-compute
  starts.

  I have started to spot, however, that on some hosts (metal only) there
  is now a race between the two whereby nova-compute sometimes fails to
  start on system boot/reboot with the following in
  /var/log/upstart/nova-compute.log:

  ...
  libvirt-bin stop/waiting
  wait-for-state stop/waiting
  neutron-ovs-cleanup start/pre-start, process 3084
  start: Job failed to start

  If I manually restart nova-compute all is fine. So this looks like a
  race between nova-compute's wait-for-state and neutron-ovs-cleanup's
  pre-start -> start/running.
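
  To check whether a host has hit this failure, one could search the log
  for the line quoted above. The following is a minimal sketch; the
  function name log_shows_failed_start is made up for this example, and
  since the log may also hold entries from earlier boots, a match only
  shows that the failure occurred at some point.

```shell
#!/bin/sh
# Hypothetical helper (not part of nova or upstart): succeeds if the
# given log file contains the failure line shown in the bug description.
log_shows_failed_start() {
    # -q: quiet, -s: no error for a missing file, -F: fixed string
    grep -qsF 'start: Job failed to start' "$1"
}

if log_shows_failed_start /var/log/upstart/nova-compute.log; then
    echo "nova-compute log contains a failed start"
fi
```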

  The proposed solution here is to add some retry logic to the
  nova-compute upstart job so that it tolerates neutron-ovs-cleanup not
  being able to start yet. We therefore allow a certain number of
  retries, every other one with an incremented delay, before giving up
  and allowing nova-compute to start anyway. If ovs-cleanup has failed to
  start after what is a fairly liberal retry period, it is assumed to
  have failed altogether, thus making it safe(ish) to start nova-compute.
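
  As a rough illustration of the retry behaviour described above (a
  minimal sketch, not the actual upstart job shipped in the package), the
  logic could look like the following. The function name
  wait_with_retries, the SLEEP_CMD override, the retry count and the
  one-second delay increment are all assumptions made for this example;
  for simplicity the delay here grows after every failed attempt.

```shell
#!/bin/sh
# Hypothetical sketch of "retry with a growing delay, then give up".
# Usage: wait_with_retries MAX_RETRIES COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 once MAX_RETRIES attempts
# have failed. SLEEP_CMD can be overridden (e.g. for testing).
wait_with_retries() {
    max_retries=$1
    shift
    attempt=0
    delay=1
    while [ "$attempt" -lt "$max_retries" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        echo "attempt $attempt/$max_retries failed; sleeping ${delay}s"
        ${SLEEP_CMD:-sleep} "$delay"
        delay=$((delay + 1))   # increment the delay for the next attempt
    done
    return 1
}

# Stand-in check: 'true' succeeds at once. In a real job this would be
# whatever command waits for neutron-ovs-cleanup (assumption).
if ! wait_with_retries 5 true; then
    echo "giving up waiting; starting anyway"
fi
```

  The caller deliberately carries on whether or not the wait succeeded,
  which mirrors the "start nova-compute anyway" decision described above.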

  [Test Case]

  In one terminal (as root) do:
  service neutron-ovs-cleanup stop; service openvswitch-switch stop; service nova-compute restart

  In another do:
  sudo tail -F /var/log/upstart/nova-compute.log

  Observe the retries occurring.

  Then do 'sudo service openvswitch-switch start' and observe
  nova-compute retry and succeed.

  [Regression Potential]

  If openvswitch-switch does not start within the maximum number of
  retries and intervals, nova-compute will start anyway, and if
  ovs-cleanup were to run at some later point one would see the behaviour
  that LP 1420572 was intended to resolve. It does not seem to make sense
  to wait indefinitely for ovs-cleanup to be up, and the coded interval
  is pretty liberal and should be plenty.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/1471022/+subscriptions


