[Bug 1739253] Re: Removing nova-compute unit with scheduled but stopped VM breaks hypervisor-list api call
Corey Bryant
corey.bryant at canonical.com
Thu Jan 11 19:46:16 UTC 2018
This appears to be fixed upstream as of Newton via bug
https://bugs.launchpad.net/nova/+bug/1646255 for which the following
patch was provided:
commit f0d44c5b09f3f3c84038d40b621bb629a1f8110e
Author: Matt Riedemann <mriedem at us.ibm.com>
Date: Sun Dec 4 15:08:04 2016 -0500
Handle ComputeHostNotFound when listing hypervisors
Compute node resources must currently be deleted manually
in the database, and as such they can reference service
records which have been deleted via the services delete API.
Because of this when listing hypervisors (compute nodes), we
may get a ComputeHostNotFound error when trying to lookup a
service record for a compute node where the service was
deleted. This causes the API to fail with a 500 since it's not
handled.
This change handles the ComputeHostNotFound when looping over
compute nodes in the hypervisors index and detail methods and
simply ignores them.
Change-Id: I2717274bb1bd370870acbf58c03dc59cee30cc5e
Closes-Bug: #1646255
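The approach described in the commit message can be sketched as follows. This is a minimal, self-contained illustration and not the upstream patch: the exception class is a local stand-in for nova's ComputeHostNotFound, and list_hypervisors, the lookup callable, the dictionary fields, and the host name mycloud-cs-001 are hypothetical names chosen for the example.

```python
class ComputeHostNotFound(Exception):
    """Local stand-in for nova's ComputeHostNotFound (assumption)."""

    def __init__(self, host):
        super().__init__("Compute host %s could not be found." % host)
        self.host = host


def list_hypervisors(compute_nodes, service_get_by_compute_host):
    """Build a hypervisor listing, skipping nodes whose service is gone.

    compute_nodes: iterable of dicts with "id", "host" and
        "hypervisor_hostname" keys (hypothetical shape).
    service_get_by_compute_host: callable that returns a service dict
        for a host name or raises ComputeHostNotFound.
    """
    hypervisors = []
    for node in compute_nodes:
        try:
            service = service_get_by_compute_host(node["host"])
        except ComputeHostNotFound:
            # The compute node row outlived its service record.
            # Ignore it instead of failing the whole request.
            continue
        hypervisors.append({
            "id": node["id"],
            "hypervisor_hostname": node["hypervisor_hostname"],
            "service_id": service["id"],
        })
    return hypervisors


def make_lookup(services_by_host):
    """Return a lookup callable backed by a plain dict (demo helper)."""
    def lookup(host):
        try:
            return services_by_host[host]
        except KeyError:
            raise ComputeHostNotFound(host)
    return lookup


if __name__ == "__main__":
    nodes = [
        {"id": 1, "host": "mycloud-cs-001",
         "hypervisor_hostname": "mycloud-cs-001"},
        {"id": 3, "host": "mycloud-cs-003",
         "hypervisor_hostname": "mycloud-cs-003"},
    ]
    # Only mycloud-cs-001 still has a service record.
    lookup = make_lookup({"mycloud-cs-001": {"id": 10}})
    for hyp in list_hypervisors(nodes, lookup):
        print(hyp)
```

Without the try/except, the first orphaned node would abort the loop and the whole listing would fail, which matches the 500 response the commit message describes.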
** Changed in: nova (Ubuntu)
Status: New => Triaged
** Changed in: nova (Ubuntu)
Importance: Undecided => Medium
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to nova in Ubuntu.
https://bugs.launchpad.net/bugs/1739253
Title:
Removing nova-compute unit with scheduled but stopped VM breaks
hypervisor-list api call
Status in OpenStack nova-cloud-controller charm:
Invalid
Status in OpenStack nova-compute charm:
Invalid
Status in Ubuntu Cloud Archive:
Fix Released
Status in Ubuntu Cloud Archive mitaka series:
Triaged
Status in nova package in Ubuntu:
Fix Released
Status in nova source package in Xenial:
Triaged
Bug description:
After removing the nova-compute unit on node mycloud-cs-003, nova
hypervisor-list stopped working and started returning the error below
to the user and logging it to nova-api-os-compute.log on the nova-api
server.
While this may be an upstream issue, I believe the charm should
probably handle this edge case.
When querying the database, I find that the service and the
compute_node entries for the host are both in deleted status, but I see
that there is a scheduled VM on the node mycloud-cs-003. I ran
nova delete <instanceid> on the instance that was scheduled on that
node, and that succeeded, but the "running_vms" total in the
compute_nodes table did not decrease, so I updated that row to
running_vms = 0. I am still seeing the traceback below in
nova-api-os-compute.log.
2017-12-19 17:59:35.733 218705 DEBUG nova.api.openstack.wsgi [req-a597c96f-a372-4a0c-9c79-d2859f1612db 2ac8326863b64ea3ba9ba96a7ab70214 51419d6b9c8f475db199c24b0e50a99d - - -] Calling method '<bound method HypervisorsController.index of <nova.api.openstack.compute.hypervisors.HypervisorsController object at 0x7fc679ba5d10>>' _process_stack /usr/lib/python2.7/dist-packages/nova/api/openstack/wsgi.py:699
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions [req-a597c96f-a372-4a0c-9c79-d2859f1612db 2ac8326863b64ea3ba9ba96a7ab70214 51419d6b9c8f475db199c24b0e50a99d - - -] Unexpected exception in API method
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions Traceback (most recent call last):
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/api/openstack/extensions.py", line 478, in wrapped
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions     return f(*args, **kwargs)
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/api/openstack/compute/hypervisors.py", line 88, in index
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions     for hyp in compute_nodes])
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/compute/api.py", line 3743, in service_get_by_compute_host
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions     return objects.Service.get_by_compute_host(context, host_name)
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 181, in wrapper
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions     result = fn(cls, context, *args, **kwargs)
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/objects/service.py", line 243, in get_by_compute_host
--
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/objects/service.py", line 238, in _db_service_get_by_compute_host
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions     return db.service_get_by_compute_host(context, host)
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 163, in service_get_by_compute_host
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions     return IMPL.service_get_by_compute_host(context, host)
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 330, in wrapped
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions     return f(context, *args, **kwargs)
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 585, in service_get_by_compute_host
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions     raise exception.ComputeHostNotFound(host=host)
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions ComputeHostNotFound: Compute host mycloud-cs-003 could not be found.
2017-12-19 17:59:36.044 218705 ERROR nova.api.openstack.extensions
2017-12-19 17:59:36.046 218705 INFO nova.api.openstack.wsgi [req-a597c96f-a372-4a0c-9c79-d2859f1612db 2ac8326863b64ea3ba9ba96a7ab70214 51419d6b9c8f475db199c24b0e50a99d - - -] HTTP exception thrown: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'nova.exception.ComputeHostNotFound'>
Steps to recreate:
1. deploy nova-cloud-controller and nova-compute with proper relations to keystone/mysql/etc
2. deploy a vm to the nova-compute environment
3. stop the instance
4. juju remove-unit <nova-compute/X> for the unit that the VM was scheduled on
5. nova hypervisor-list should exhibit this error.
Please let me know if this does not work.
Notes: this environment was previously upgraded from either icehouse
or liberty to mitaka. (guessing liberty since the service deleted and
compute_node deleted columns are ordered, incrementing numbers, and
not just 0 or 1)
Running openstack 17.02 charms, I believe on trusty/mitaka cloud.
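The kind of database check mentioned in the description can be sketched as below. This is an illustration under stated assumptions, not a query against a real deployment: it uses an in-memory SQLite database with a heavily simplified stand-in for the services and compute_nodes tables, treats deleted = 0 as "live" and a non-zero value as soft-deleted, and the function names, sample rows, and host name mycloud-cs-001 are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with simplified stand-in tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE services (
            id INTEGER PRIMARY KEY,
            host TEXT,
            "binary" TEXT,
            deleted INTEGER
        );
        CREATE TABLE compute_nodes (
            id INTEGER PRIMARY KEY,
            host TEXT,
            running_vms INTEGER,
            deleted INTEGER
        );
        -- Live service and compute node (hypothetical host).
        INSERT INTO services VALUES (1, 'mycloud-cs-001', 'nova-compute', 0);
        INSERT INTO compute_nodes VALUES (1, 'mycloud-cs-001', 2, 0);
        -- Service soft-deleted (deleted set to a non-zero value),
        -- compute node row still live.
        INSERT INTO services VALUES (3, 'mycloud-cs-003', 'nova-compute', 3);
        INSERT INTO compute_nodes VALUES (3, 'mycloud-cs-003', 1, 0);
    """)
    return conn


def find_orphaned_compute_nodes(conn):
    """Return live compute_nodes rows with no live nova-compute service."""
    query = """
        SELECT cn.id, cn.host, cn.running_vms
        FROM compute_nodes AS cn
        LEFT JOIN services AS s
            ON s.host = cn.host
           AND s."binary" = 'nova-compute'
           AND s.deleted = 0
        WHERE cn.deleted = 0
          AND s.id IS NULL
        ORDER BY cn.id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for row in find_orphaned_compute_nodes(conn):
        print(row)
```

Rows returned by such a query are the ones a hypervisor listing would fail on if the service lookup error is not handled. In the environment described above both rows were reported as deleted, so a query like this may return nothing there.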
To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-nova-cloud-controller/+bug/1739253/+subscriptions