[SRU][Lunar][PATCH v2 0/1] UBUNTU: SAUCE: Add mdev_set_iommu_device() kABI.

Dominik Csapak d.csapak at proxmox.com
Wed Jun 14 11:41:34 UTC 2023


Hi, a short question about this:

On 5/18/23 13:18, Tarun Gupta wrote:
> BugLink : https://bugs.launchpad.net/bugs/1988806
> 
> SRU Justification:
> 
> [Impact]
> 
> Currently, with the below commit present in the 5.16 upstream kernel,
> the mdev_set_iommu_device() kABI is removed.
> 
>   fda49d97f2c4 ("vfio: remove the unused mdev iommu hook")
> 
> This results in SRIOV based Nvidia vGPU being broken with kernels that
> have the above upstream commit present.
> So, with the Ubuntu 22.04 HWE kernel update (i.e. the 6.2.x Lunar kernel),
> SRIOV based Nvidia vGPU is broken.
> 
> Earlier, during 5.19.x HWE kernel in Kinetic release, a similar patch
> was accepted. Refer
> https://lists.ubuntu.com/archives/kernel-team/2022-September/133142.html
> But this patch wasn't carried forward from Kinetic to Lunar because
> of an upstream merge conflict and had to be reverted.
> 
> [Fix]
> 
> On 6.2.x HWE kernel, we revert the above patch which removed the
> support for mdev_set_iommu_device() kABI so that vGPU works fine.
> 
> Separately, to fix this in upstream kernels, vGPU is planning to adopt
> vfio-pci-core framework instead of using MDEV framework.
> Currently, vfio-pci-core framework works with SRIOV vGPU but lacks
> libvirt support to assign VFs using vfio-pci-core framework.
> Will work with upstream libvirt community to get libvirt support for
> vfio-pci-core devices. Post that, we don't need this custom mdev patch
> and vGPU can work out-of-box on Ubuntu with vfio-pci-core framework.


AFAIU these patches should fix vGPU usage on kernel 6.2?

We (Proxmox) use an Ubuntu-based kernel for our distribution, and with 6.2 (even with these
patches) we were unable to install the Linux KVM vGPU driver (15.2) from the NVIDIA site [0]
because of compilation issues. This would indicate that more changed in the kernel than just
what these patches revert/fix. (I also tried the Ubuntu package from there, but it did
not make a difference.)
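
For reference, a minimal sketch of how one could check whether a given kernel headers tree
declares mdev_set_iommu_device() in include/linux/mdev.h (one of the files in the diffstat).
The /lib/modules/<release>/build location and the helper names are assumptions for
illustration only; this is not part of the patch or of the NVIDIA installer.

```python
import platform
import re
from pathlib import Path

SYMBOL = "mdev_set_iommu_device"


def declares_symbol(header_text: str, symbol: str = SYMBOL) -> bool:
    """Return True if the text contains `symbol` followed by an opening parenthesis."""
    pattern = r"\b" + re.escape(symbol) + r"\s*\("
    return re.search(pattern, header_text) is not None


def check_headers(build_dir: str, symbol: str = SYMBOL) -> bool:
    """Look for the symbol in <build_dir>/include/linux/mdev.h (assumed layout)."""
    header = Path(build_dir) / "include" / "linux" / "mdev.h"
    if not header.is_file():
        # No headers installed at this location, so nothing to report.
        return False
    return declares_symbol(header.read_text(errors="replace"), symbol)


if __name__ == "__main__":
    # Assumption: headers for the running kernel live under this path.
    build_dir = f"/lib/modules/{platform.release()}/build"
    print(build_dir, check_headers(build_dir))
```

A True result only says the declaration is present in the headers; it does not say whether
the rest of the driver build will succeed.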

Is there some other way to get vGPU/mdev working besides using the proprietary NVIDIA driver?

For our next release (currently in beta) we'll be using at least a 6.2 kernel, and we currently
have to tell our users that NVIDIA vGPU is not possible (due to the above-mentioned compile issues).

Any hints/tips on what we might be doing wrong?

Thanks

0: https://docs.nvidia.com/grid/15.0/index.html

> 
> [Testcase]
> 
> Run SRIOV based (Ampere+) Nvidia vGPU on 6.2.x (Lunar) kernel.
> 
> Tarun Gupta (1):
>    UBUNTU: SAUCE: Add mdev_set_iommu_device() kABI.
> 
>   drivers/vfio/mdev/mdev_driver.c  |   1 +
>   drivers/vfio/mdev/mdev_private.h |   1 -
>   drivers/vfio/vfio_iommu_type1.c  | 126 ++++++++++++++++++++++++++++---
>   include/linux/mdev.h             |  22 ++++++
>   4 files changed, 140 insertions(+), 10 deletions(-)
> 




