[Bug 1774249] Please test proposed package

Robie Basak 1774249 at bugs.launchpad.net
Wed Oct 13 14:17:47 UTC 2021


Hello Matthew, or anyone else affected,

Accepted nova into bionic-proposed. The package will build now and be
available at
https://launchpad.net/ubuntu/+source/nova/2:17.0.13-0ubuntu4 in a few
hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, what testing has been
performed on the package and change the tag from
verification-needed-bionic to verification-done-bionic. If it does not
fix the bug for you, please add a comment stating that, and change the
tag to verification-failed-bionic. In either case, without details of
your testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1774249

Title:
  update_available_resource will raise DiskNotFound after resize but
  before confirm

Status in Ubuntu Cloud Archive:
  Invalid
Status in Ubuntu Cloud Archive queens series:
  Triaged
Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) ocata series:
  Triaged
Status in OpenStack Compute (nova) pike series:
  Fix Committed
Status in OpenStack Compute (nova) queens series:
  Fix Committed
Status in OpenStack Compute (nova) rocky series:
  Fix Committed
Status in OpenStack Compute (nova) stein series:
  Fix Released
Status in OpenStack Compute (nova) train series:
  Fix Released
Status in nova package in Ubuntu:
  Invalid
Status in nova source package in Bionic:
  Fix Committed

Bug description:
  Originally reported in RH Bugzilla:
  https://bugzilla.redhat.com/show_bug.cgi?id=1584315

  Tested on OSP12 (Pike), but appears to be still present on master.
  Should only occur if nova compute is configured to use local file
  instance storage.

  Create instance A on compute X

  Resize instance A to compute Y
    Domain is powered off
    /var/lib/nova/instances/<uuid A> renamed to <uuid A>_resize on X
    Domain is *not* undefined

  On compute X:
    update_available_resource runs as a periodic task
    First action is to update self
    rt calls driver.get_available_resource()
    ...calls _get_disk_over_committed_size_total
    ...iterates over all defined domains, including the ones whose disks we renamed
    ...fails because a referenced disk no longer exists
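
The failure mode above can be sketched in a few lines. This is a simplified
illustration, not nova's actual code: DiskNotFound here is a local stand-in
for nova's exception, and allocated_size, total_disk_size_strict and
total_disk_size_tolerant are hypothetical names. The tolerant variant is one
plausible reading of a fix (log a warning for the affected domain and carry
on), assumed here for illustration.

```python
import logging
import os

LOG = logging.getLogger(__name__)


class DiskNotFound(Exception):
    """Local stand-in for nova's DiskNotFound (illustrative only)."""


def allocated_size(path):
    # Hypothetical helper: the real driver asks qemu-img for the size;
    # this sketch only stats the file.
    if not os.path.exists(path):
        raise DiskNotFound("No disk at %s" % path)
    return os.path.getsize(path)


def total_disk_size_strict(domains):
    # Behaviour described in the report: a single missing disk raises and
    # aborts the whole periodic update.
    return sum(allocated_size(path) for _name, path in domains)


def total_disk_size_tolerant(domains):
    # Assumed tolerant behaviour: skip the domain whose disk was renamed
    # away by the resize, log a warning, and keep counting the rest.
    total = 0
    for name, path in domains:
        try:
            total += allocated_size(path)
        except DiskNotFound as exc:
            LOG.warning("Skipping disk info for %s: %s", name, exc)
    return total
```

With a list of (domain name, disk path) pairs where one path has been
renamed to <uuid>_resize, the strict variant raises DiskNotFound while the
tolerant variant still returns the total for the remaining domains.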

  Results in errors in nova-compute.log:

      2018-05-30 02:17:08.647 1 ERROR nova.compute.manager [req-bd52371f-c6ec-4a83-9584-c00c5377acd8 - - - - -] Error updating resources for node compute-0.localdomain.: DiskNotFound: No disk at /var/lib/nova/instances/f3ed9015-3984-43f4-b4a5-c2898052b47d/disk
      2018-05-30 02:17:08.647 1 ERROR nova.compute.manager Traceback (most recent call last):
      2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6695, in update_available_resource_for_node
      2018-05-30 02:17:08.647 1 ERROR nova.compute.manager     rt.update_available_resource(context, nodename)
      2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 641, in update_available_resource
      2018-05-30 02:17:08.647 1 ERROR nova.compute.manager     resources = self.driver.get_available_resource(nodename)
      2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5892, in get_available_resource
      2018-05-30 02:17:08.647 1 ERROR nova.compute.manager     disk_over_committed = self._get_disk_over_committed_size_total()
      2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 7393, in _get_disk_over_committed_size_total
      2018-05-30 02:17:08.647 1 ERROR nova.compute.manager     config, block_device_info)
      2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 7301, in _get_instance_disk_info_from_config
      2018-05-30 02:17:08.647 1 ERROR nova.compute.manager     dk_size = disk_api.get_allocated_disk_size(path)
      2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/disk/api.py", line 156, in get_allocated_disk_size
      2018-05-30 02:17:08.647 1 ERROR nova.compute.manager     return images.qemu_img_info(path).disk_size
      2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/images.py", line 57, in qemu_img_info
      2018-05-30 02:17:08.647 1 ERROR nova.compute.manager     raise exception.DiskNotFound(location=path)
      2018-05-30 02:17:08.647 1 ERROR nova.compute.manager DiskNotFound: No disk at /var/lib/nova/instances/f3ed9015-3984-43f4-b4a5-c2898052b47d/disk

  After this, the resource tracker is no longer updated. Many occurrences
  of this error can be found in the gate.

  Note that change Icec2769bf42455853cbe686fb30fda73df791b25 nearly
  mitigates this, but doesn't because task_state is not set while the
  instance is awaiting confirm.
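
A small sketch of why a task_state-based check misses this case. It models
the earlier change loosely as "tolerate a missing disk only while task_state
is set", which is an assumption about its shape rather than a copy of it.
The dictionaries, the function names and the vm_state value "resized" are
illustrative assumptions as well.

```python
def skipped_by_task_state(instance):
    # Modelled mitigation: tolerate a missing disk only while a task runs.
    return instance.get("task_state") is not None


def skipped_by_task_or_vm_state(instance):
    # Hypothetical widening: also tolerate it while a resize awaits confirm.
    return (instance.get("task_state") is not None
            or instance.get("vm_state") == "resized")


# Instance A as seen on compute X after the resize finished, before confirm:
# no task is running, so task_state is unset.
awaiting_confirm = {"vm_state": "resized", "task_state": None}

# Instance A while the resize is still in flight (illustrative values).
mid_resize = {"vm_state": "active", "task_state": "resize_migrating"}
```

Here skipped_by_task_state(mid_resize) is True but
skipped_by_task_state(awaiting_confirm) is False, which is the gap described
above. The widened check covers both.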

  ================================================================================= 
  [Impact] 

  See above

  
  [Test Plan]

  Deploy OpenStack Queens with one compute node.

  Create a VM instance. Eg:
  openstack server create --wait --image $image_name --flavor $flavor --key-name testkey --nic net-id=${net_id} test-instance-1234

  Get the details for that instance and copy the instance_name. Eg:
  openstack server show test-instance-1234 -c OS-EXT-SRV-ATTR:instance_name -f value

  Get the disk location used, based on the instance name retrieved above (held in $var_name here). Eg:
  disk_location=`juju run -a nova-compute -- virsh domblklist $var_name | grep nova | awk -v N=2 '{print $N}'`

  Move that file to a different location. Eg:
  juju run -a nova-compute -- mv $disk_location "$disk_location"_backup

  Check the nova compute logs on the compute node for a warning. Eg:
  juju run -a nova-compute -- grep "DiskNotFound" /var/log/nova/nova-compute.log

  The output should look like the following:
  ```
  2021-09-22 11:07:46.009 26176 WARNING nova.virt.libvirt.driver [req-6e8eb87e-4024-4908-9b7f-0648ecd03eaf - - - - -] Periodic task is updating the host stats, it is trying to get disk info for instance-00000001, but the backing disk storage was removed by a concurrent operation such as resize. Error: No disk at /var/lib/nova/instances/3bd9578f-e7d7-48bc-bdef-d2d4cb25ea29/disk: DiskNotFound: No disk at /var/lib/nova/instances/3bd9578f-e7d7-48bc-bdef-d2d4cb25ea29/disk
  ```
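
As a convenience when checking the grep output, the following sketch counts
DiskNotFound lines by log level. It assumes the line layout seen in the
excerpts in this report (date, time, pid, level, logger). The function names
are hypothetical, and "a WARNING and no ERROR" is only an assumed shorthand
for the expected result, not an official verification criterion.

```python
import re

# date, time, pid, level, logger; layout taken from the log excerpts above.
LINE_RE = re.compile(r"^\s*\S+ \S+ \d+ (?P<level>[A-Z]+) (?P<logger>\S+)")


def count_disk_not_found(lines):
    # Count lines mentioning DiskNotFound, grouped by WARNING and ERROR.
    counts = {"WARNING": 0, "ERROR": 0}
    for line in lines:
        if "DiskNotFound" not in line:
            continue
        match = LINE_RE.match(line)
        if match and match.group("level") in counts:
            counts[match.group("level")] += 1
    return counts


def looks_fixed(lines):
    # Assumed shorthand: the warning is logged and no error is raised.
    counts = count_disk_not_found(lines)
    return counts["WARNING"] >= 1 and counts["ERROR"] == 0
```

Feed it only log lines written after the updated package was installed,
since older ERROR entries from before the upgrade may still be in the file.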

  [Where problems could occur]

  Users who were relying on the error being raised could be affected.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1774249/+subscriptions



