[Bug 1692397] Re: hypervisor statistics could be incorrect

Launchpad Bug Tracker 1692397 at bugs.launchpad.net
Mon Dec 18 18:24:26 UTC 2017


This bug was fixed in the package nova - 2:13.1.4-0ubuntu4.2

---------------
nova (2:13.1.4-0ubuntu4.2) xenial; urgency=medium

  [ Seyeong Kim ]
  * Add supporting http_proxy_to_wsgi to api-paste.ini (LP: #1573766)
    - d/p/0001-Add-http_proxy_to_wsgi-to-api-paste.patch
    - d/p/0002-Add-proxy-middleware-to-application-pipeline.patch

  [ Edward Hope-Morley ]
  * Patch nova.db.sqlalchemy.api.compute_node_statistics() to
    exclude deleted services from stats count. This is the same
    fix as that backported to newton in bug 1692397 except that
    the actual patch is not backportable due to the underlying
    code changing extensively.
    - d/p/exlude-deleted-service-from-stats-count.patch (LP: #1692397)

 -- Corey Bryant <corey.bryant at canonical.com>  Fri, 08 Dec 2017 15:44:43 -0500

** Changed in: nova (Ubuntu Xenial)
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1692397

Title:
  hypervisor statistics could be incorrect

Status in Ubuntu Cloud Archive:
  Fix Released
Status in Ubuntu Cloud Archive mitaka series:
  Triaged
Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) newton series:
  Fix Committed
Status in OpenStack Compute (nova) ocata series:
  Fix Committed
Status in nova package in Ubuntu:
  Fix Released
Status in nova source package in Xenial:
  Fix Released

Bug description:
  [Impact]

  If you deploy a nova-compute service to a node, delete that service
  (via the API), then deploy a new nova-compute service to that same
  node, i.e. with the same hostname, the database will have two service
  records, one marked as deleted and the other not. That is fine until
  you run 'openstack hypervisor stats show', at which point the API
  aggregates the resource counts from both services. This has been
  fixed upstream and backported only as far back as Newton, so the
  problem still exists on Mitaka. I assume the patch was not backported
  to Mitaka because the code in
  nova.db.sqlalchemy.api.compute_node_statistics() changed quite a bit.
  However, the old code needs only a one-line change (that does the
  same thing as the new code) to fix this issue.
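
  The double counting described above can be reproduced in miniature.
  The following is only a sketch under stated assumptions: the two
  tables, their columns and the query are a simplified, hypothetical
  model, not nova's actual schema or the code in
  compute_node_statistics(). It assumes the statistics come from a join
  of compute nodes to services on the hostname, and that a non-zero
  'deleted' value marks a soft-deleted row.

```python
import sqlite3


def build_db():
    """Create a tiny in-memory model of the situation (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE services (
            id INTEGER, host TEXT, binary_name TEXT, deleted INTEGER);
        CREATE TABLE compute_nodes (
            id INTEGER, host TEXT, vcpus INTEGER, memory_mb INTEGER,
            deleted INTEGER);
        -- Old service record, soft-deleted via the API.
        INSERT INTO services VALUES (8, 'SZX1000291919', 'nova-compute', 1);
        -- New service record created when nova-compute was redeployed.
        INSERT INTO services VALUES (10, 'SZX1000291919', 'nova-compute', 0);
        -- The single live compute node on that host.
        INSERT INTO compute_nodes VALUES (1, 'SZX1000291919', 8, 7960, 0);
    """)
    return conn


def stats(conn, exclude_deleted_services):
    """Aggregate (count, vcpus, memory_mb) over compute nodes joined to services."""
    sql = """
        SELECT COUNT(cn.id), SUM(cn.vcpus), SUM(cn.memory_mb)
        FROM compute_nodes AS cn
        JOIN services AS s ON s.host = cn.host
        WHERE cn.deleted = 0 AND s.binary_name = 'nova-compute'
    """
    if exclude_deleted_services:
        # The extra condition that keeps the deleted service out of the join.
        sql += " AND s.deleted = 0"
    return conn.execute(sql).fetchone()


conn = build_db()
print("without filter:", stats(conn, exclude_deleted_services=False))
print("with filter:   ", stats(conn, exclude_deleted_services=True))
```

  In this model the one compute node matches both service rows, so the
  unfiltered query counts it twice; filtering on the service's deleted
  flag restores the expected totals.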

  [Test Case]

   * Deploy Mitaka with bundle http://pastebin.ubuntu.com/25968008/

   * Do 'openstack hypervisor stats show' and verify that count is 3

   * Do 'juju remove-unit nova-compute/2' to delete a compute service
  but not its physical host

   * Do 'openstack compute service delete <id>' to delete the compute
  service we just removed (choosing the correct id)

   * Do 'openstack hypervisor stats show' and verify that count is 2

   * Do 'juju add-unit nova-compute --to <machine id of deleted unit>'

   * Do 'openstack hypervisor stats show' and verify that count is 3
  (not 4 as it would be before the fix)
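
  The count checks in the steps above could be scripted. This is a
  minimal sketch, assuming the statistics are available as a JSON
  object with a "count" key (for example, captured from 'openstack
  hypervisor stats show -f json'; the exact output shape is an
  assumption here). The helper names hypervisor_count and
  counts_match are hypothetical.

```python
import json


def hypervisor_count(stats_json):
    """Return the 'count' field from JSON-formatted hypervisor statistics."""
    return int(json.loads(stats_json)["count"])


def counts_match(observed, expected=(3, 2, 3)):
    """Compare the counts seen at the three check points with the test case.

    The default expectation follows the test case: 3 after deployment,
    2 after the service is deleted, 3 again after the unit is re-added.
    """
    return list(observed) == list(expected)


# Illustrative input only; real output would carry more fields.
sample = '{"count": 3, "vcpus": 24}'
print(hypervisor_count(sample))
```

  A run that ends with counts (3, 2, 4) would indicate the unfixed
  behaviour described in this bug.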

  [Regression Potential]

  None anticipated other than for clients that were interpreting invalid
  counts as correct.

  [Other Info]
   
  ===========================================================================

  Hypervisor statistics could be incorrect:

  When we kill a nova-compute service, delete the service from the nova DB, and then
  start the nova-compute service again, the result of the Hypervisor/statistics API (nova hypervisor-stats) is
  incorrect.

  How to reproduce:

  Step1. Check the correct statistics before we do anything:
  root at SZX1000291919:/opt/stack/nova# nova  hypervisor-stats
  +----------------------+-------+
  | Property             | Value |
  +----------------------+-------+
  | count                | 1     |
  | current_workload     | 0     |
  | disk_available_least | 14    |
  | free_disk_gb         | 34    |
  | free_ram_mb          | 6936  |
  | local_gb             | 35    |
  | local_gb_used        | 1     |
  | memory_mb            | 7960  |
  | memory_mb_used       | 1024  |
  | running_vms          | 1     |
  | vcpus                | 8     |
  | vcpus_used           | 1     |
  +----------------------+-------+

  Step2. Kill the compute service:
  root at SZX1000291919:/var/log/nova# ps -ef | grep nova-com
  root     120419 120411  0 11:06 pts/27   00:00:00 sg libvirtd /usr/local/bin/nova-compute --config-file /etc/nova/nova.conf --log-file /var/log/nova/nova-compute.log
  root     120420 120419  0 11:06 pts/27   00:00:07 /usr/bin/python /usr/local/bin/nova-compute --config-file /etc/nova/nova.conf --log-file /var/log/nova/nova-compute.log

  root at SZX1000291919:/var/log/nova# kill -9 120419
  root at SZX1000291919:/var/log/nova# /usr/local/bin/stack: line 19: 120419 Killed                  sg libvirtd '/usr/local/bin/nova-compute --config-file /etc/nova/nova.conf --log-file /var/log/nova/nova-compute.log' > /dev/null 2>&1

  root at SZX1000291919:/var/log/nova# nova service-list
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | Id | Binary           | Host          | Zone     | Status  | State | Updated_at                 | Disabled Reason |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | 4  | nova-conductor   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:24:36.000000 | -               |
  | 6  | nova-scheduler   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:24:36.000000 | -               |
  | 7  | nova-consoleauth | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:24:37.000000 | -               |
  | 8  | nova-compute     | SZX1000291919 | nova     | enabled | down  | 2017-05-22T03:23:38.000000 | -               |
  | 9  | nova-cert        | SZX1000291919 | internal | enabled | down  | 2017-05-17T02:50:13.000000 | -               |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+

  Step3. Delete the service from DB:

  root at SZX1000291919:/var/log/nova# nova service-delete 8
  root at SZX1000291919:/var/log/nova# nova service-list
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | Id | Binary           | Host          | Zone     | Status  | State | Updated_at                 | Disabled Reason |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | 4  | nova-conductor   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:25:16.000000 | -               |
  | 6  | nova-scheduler   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:25:16.000000 | -               |
  | 7  | nova-consoleauth | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:25:17.000000 | -               |
  | 9  | nova-cert        | SZX1000291919 | internal | enabled | down  | 2017-05-17T02:50:13.000000 | -               |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+

  Step4. Start the compute service again:
  root at SZX1000291919:/var/log/nova# nova service-list
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | Id | Binary           | Host          | Zone     | Status  | State | Updated_at                 | Disabled Reason |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | 4  | nova-conductor   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:48:55.000000 | -               |
  | 6  | nova-scheduler   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:48:56.000000 | -               |
  | 7  | nova-consoleauth | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:48:56.000000 | -               |
  | 9  | nova-cert        | SZX1000291919 | internal | enabled | down  | 2017-05-17T02:50:13.000000 | -               |
  | 10 | nova-compute     | SZX1000291919 | nova     | enabled | up    | 2017-05-22T03:48:57.000000 | -               |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+

  Step5. Check the hypervisor statistics again; the result is incorrect:

  root at SZX1000291919:/var/log/nova# nova  hypervisor-stats
  +----------------------+-------+
  | Property             | Value |
  +----------------------+-------+
  | count                | 2     |
  | current_workload     | 0     |
  | disk_available_least | 28    |
  | free_disk_gb         | 68    |
  | free_ram_mb          | 13872 |
  | local_gb             | 70    |
  | local_gb_used        | 2     |
  | memory_mb            | 15920 |
  | memory_mb_used       | 2048  |
  | running_vms          | 2     |
  | vcpus                | 16    |
  | vcpus_used           | 2     |
  +----------------------+-------+
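
  Comparing the Step1 and Step5 tables shows that every property has
  exactly doubled, which is consistent with the single hypervisor being
  counted twice. A short sketch of that comparison, using only the
  values printed above (the variable names are illustrative):

```python
# Values copied from the Step1 output (before the service was deleted).
before = {
    "count": 1, "current_workload": 0, "disk_available_least": 14,
    "free_disk_gb": 34, "free_ram_mb": 6936, "local_gb": 35,
    "local_gb_used": 1, "memory_mb": 7960, "memory_mb_used": 1024,
    "running_vms": 1, "vcpus": 8, "vcpus_used": 1,
}

# Values copied from the Step5 output (after the service was restarted).
after = {
    "count": 2, "current_workload": 0, "disk_available_least": 28,
    "free_disk_gb": 68, "free_ram_mb": 13872, "local_gb": 70,
    "local_gb_used": 2, "memory_mb": 15920, "memory_mb_used": 2048,
    "running_vms": 2, "vcpus": 16, "vcpus_used": 2,
}

# Properties whose Step5 value is exactly twice the Step1 value.
doubled = [name for name in before if after[name] == 2 * before[name]]
print(len(doubled), "of", len(before), "properties doubled")
```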

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1692397/+subscriptions


