[Bug 1977933] Re: mediated devices missing after reboot

James Page 1977933 at bugs.launchpad.net
Wed Jun 8 07:33:37 UTC 2022


** Also affects: nova (Ubuntu)
   Importance: Undecided
       Status: New

** Summary changed:

- mediated devices missing after reboot
+ nova fails to re-create mediated devices after reboot

** Description changed:

  OpenStack Xena
  Ubuntu 20.04
  
  After a reboot of a nova-compute node with running instances with
  attached vGPU devices, the nova-compute daemon fails to start up due to
- missing mediated device definitions:
+ missing mediated device definitions.
+ 
+ It looks like the code intends to detect the missing devices and then
+ re-create them, but the libvirt Python module raises an exception due to
+ the missing mediated device when the domain definition is being
+ inspected.
  
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service [-] Error starting thread.: libvirt.libvirtError: Node device not found: no node device with matching name 'mdev_9a95927e_f50a_4e34_84fc_3b27508f4241'
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service Traceback (most recent call last):
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service   File "/usr/lib/python3/dist-packages/oslo_service/service.py", line 806, in run_service
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service     service.start()
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service   File "/usr/lib/python3/dist-packages/nova/service.py", line 159, in start
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service     self.manager.init_host()
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service   File "/usr/lib/python3/dist-packages/nova/compute/manager.py", line 1416, in init_host
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service     self.driver.init_host(host=self.host)
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service   File "/usr/lib/python3/dist-packages/nova/virt/libvirt/driver.py", line 800, in init_host
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service     self._recreate_assigned_mediated_devices()
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service   File "/usr/lib/python3/dist-packages/nova/virt/libvirt/driver.py", line 980, in _recreate_assigned_mediated_devices
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service     dev_info = self._get_mediated_device_information(dev_name)
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service   File "/usr/lib/python3/dist-packages/nova/virt/libvirt/driver.py", line 7761, in _get_mediated_device_information
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service     virtdev = self._host.device_lookup_by_name(devname)
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service   File "/usr/lib/python3/dist-packages/nova/virt/libvirt/host.py", line 1216, in device_lookup_by_name
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service     return self.get_connection().nodeDeviceLookupByName(name)
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service   File "/usr/lib/python3/dist-packages/eventlet/tpool.py", line 193, in doit
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service     result = proxy_call(self._autowrap, f, *args, **kwargs)
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service   File "/usr/lib/python3/dist-packages/eventlet/tpool.py", line 151, in proxy_call
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service     rv = execute(f, *args, **kwargs)
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service   File "/usr/lib/python3/dist-packages/eventlet/tpool.py", line 132, in execute
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service     six.reraise(c, e, tb)
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service   File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service     raise value
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service   File "/usr/lib/python3/dist-packages/eventlet/tpool.py", line 86, in tworker
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service     rv = meth(*args, **kwargs)
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service   File "/usr/lib/python3/dist-packages/libvirt.py", line 4612, in nodeDeviceLookupByName
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service     if ret is None:raise libvirtError('virNodeDeviceLookupByName() failed', conn=self)
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service libvirt.libvirtError: Node device not found: no node device with matching name 'mdev_9a95927e_f50a_4e34_84fc_3b27508f4241'
  2022-06-08 07:24:27.061 2689 ERROR oslo_service.service

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to nova in Ubuntu.
https://bugs.launchpad.net/bugs/1977933

Title:
  nova fails to re-create mediated devices after reboot

Status in OpenStack Nova Compute NVIDIA vGPU Plugin Charm:
  New
Status in nova package in Ubuntu:
  New

Bug description:
  OpenStack Xena
  Ubuntu 20.04

  After a reboot of a nova-compute node with running instances with
  attached vGPU devices, the nova-compute daemon fails to start up due to
  missing mediated device definitions.

  It looks like the code intends to detect the missing devices and then
  re-create them, but the libvirt Python module raises an exception due
  to the missing mediated device when the domain definition is being
  inspected.
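
  The detection step described here could be sketched roughly as follows.
  This is a self-contained illustration, not nova's code and not a proposed
  fix: FakeLibvirtError, FakeConnection, NO_NODE_DEVICE and the helper
  function names are all hypothetical stand-ins so the example runs without
  libvirt installed. With the real bindings one would presumably catch
  libvirt.libvirtError and compare get_error_code() against
  libvirt.VIR_ERR_NO_NODE_DEVICE. The sample domain XML is an assumption
  modelled on the device name in the log.

```python
import xml.etree.ElementTree as ET

# Placeholder value; with the real bindings use libvirt.VIR_ERR_NO_NODE_DEVICE.
NO_NODE_DEVICE = "no-node-device"


class FakeLibvirtError(Exception):
    """Hypothetical stand-in for libvirt.libvirtError."""

    def __init__(self, message, code):
        super().__init__(message)
        self._code = code

    def get_error_code(self):
        return self._code


class FakeConnection:
    """Hypothetical stand-in for a libvirt connection with a fixed device set."""

    def __init__(self, known_devices):
        self._known = set(known_devices)

    def nodeDeviceLookupByName(self, name):
        # Mimic the failure in the traceback: unknown names raise an error.
        if name not in self._known:
            raise FakeLibvirtError(
                "Node device not found: no node device with matching name "
                "'%s'" % name, NO_NODE_DEVICE)
        return name


def mdev_names_from_domain_xml(domain_xml):
    """Return node device names for mdev hostdevs found in a domain definition."""
    root = ET.fromstring(domain_xml)
    names = []
    for hostdev in root.findall("./devices/hostdev[@type='mdev']"):
        address = hostdev.find("./source/address")
        if address is not None and address.get("uuid"):
            # Node device names use underscores where the UUID has dashes.
            names.append("mdev_" + address.get("uuid").replace("-", "_"))
    return names


def lookup_or_none(conn, name):
    """Look up a node device, returning None when it simply does not exist."""
    try:
        return conn.nodeDeviceLookupByName(name)
    except FakeLibvirtError as exc:
        if exc.get_error_code() == NO_NODE_DEVICE:
            return None
        raise  # Any other libvirt failure is still treated as fatal.


def find_missing_mdevs(conn, domain_xml):
    """List mdev names referenced by the domain but absent on the host."""
    return [name for name in mdev_names_from_domain_xml(domain_xml)
            if lookup_or_none(conn, name) is None]


# Assumed example of a guest definition with one attached mediated device.
DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <hostdev mode='subsystem' type='mdev' model='vfio-pci'>
      <source>
        <address uuid='9a95927e-f50a-4e34-84fc-3b27508f4241'/>
      </source>
    </hostdev>
  </devices>
</domain>
"""

if __name__ == "__main__":
    # After a reboot the host knows no mediated devices at all.
    print(find_missing_mdevs(FakeConnection([]), DOMAIN_XML))
```

  The point of the sketch is only that a "not found" lookup result is
  treated as "needs re-creating" rather than as a fatal startup error;
  how the devices are then re-created is outside its scope.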

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-nova-compute-nvidia-vgpu/+bug/1977933/+subscriptions



