[SRU][canonical-kernel-snap/main][kernel-snaps-uc24.04/pc][PATCH v4 0/1] add nvidia-550 driver components

Aaron Jauregui aaron.jauregui at canonical.com
Wed Jan 8 22:33:01 UTC 2025


BugLink: https://bugs.launchpad.net/bugs/2088970

Note: This patchset should be applied once snapd 2.67 is in
latest/stable. This is not the case yet at the time of writing but
likely will be at the time of review.

[Changes between v3 and v4]
- Patchset no longer RFC, as snapd component support should be landing
  in latest/stable.

- Fixed a bug in the build process that could result in version
  mismatches of nvidia libraries between the ko and user components
  depending on the status of the deb archives for each. This required an
  overhaul of the build scripts. This is intended as a temporary solution,
  and I intend to replace it with an swm template-based approach soon.

- Added a cleanup stanza for the nvidia-550-user component. Testing
  discovered issues with conflicts created by mesa libraries that
  should not be present in the component, so these are pruned.

[Changes between v2 and v3]
  - cleaned up scripts and comments
  - added better summaries/descriptions for components
  - restructured hooks directory into per-component subdirectories
  - shifted organize blocks to respective components
  - replaced post-refresh hook copying with corresponding files in the
    hooks directory (note: snapcraft did not support using symlinks for
    this)
  - updated [Impact] section with more detailed information

[Changes between v1 and v2]

kernel-snaps-uc24.04:
  - replaced TODO HACK FOR HOOKS
  - updated nvidia userspace component type to standard

hooks:
  - included install hook for the pc-kernel to install nouveau
    component by default

[Impact]
Snap components are a way to have optional content for snaps available
for install without resorting to building a completely new snap. It's
useful to think of them as lazy loading for snaps. Concretely, components
are themselves snaps with locked-down functionality that are mounted
within their parent snap's filesystem. Component revisions are tied 1 to
1 with their parent snap revision at upload time, meaning that any refresh
also refreshes the components tied to the snap. This also means that
components MUST be uploaded alongside the parent snap, or the store will
reject the upload.

We use components here to provide a way for nvidia drivers to be selected
for the pc-kernel without having to rebuild, targeting the nvidia-550
driver as a starting point with the aim of supporting more driver versions
in the future. Since nouveau, currently included in the pc-kernel,
conflicts with nvidia, we replace the nouveau .ko with a component
compatible with the nvidia component scheme.

Images are intended to get a graphics component in one of two ways: an nvidia
graphics component is preloaded by being declared in the model, or, at first
boot, the pc-kernel's install hook detects that no nvidia graphics component
exists and downloads nouveau. This should cover the upgrade case for users of
the existing pc-kernel that rely on nouveau. I am working on further
functionality allowing snap set to select the desired graphics version and
either configure it (if the component exists on disk) or fetch it from the
store.
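
As a rough illustration of the first-boot fallback described above, the check
in the pc-kernel install hook could look something like the sketch below. The
function names, the directory argument and the nvidia-* name match are
assumptions made for illustration and are not taken from the patch; the real
hook may detect components differently.

```shell
#!/bin/sh
# Hypothetical sketch only: names and layout are assumptions, not patch contents.

# Succeeds if a directory named nvidia-* exists directly under $1,
# where $1 is assumed to hold one subdirectory per installed component.
has_nvidia_component() {
    for entry in "$1"/nvidia-*; do
        [ -d "$entry" ] && return 0
    done
    return 1
}

# Prints the component that should be installed by default, if any:
# "nouveau" when no nvidia component is present, nothing otherwise.
default_graphics_component() {
    if ! has_nvidia_component "$1"; then
        echo nouveau
    fi
}
```

A hook could pass the printed name to whatever mechanism installs the
component. That step is left out here because it depends on the snapd
facilities available to the hook.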

The implemented components rely on install, refresh, and remove hooks
for the respective functionality. These are intended to be placed in
canonical-kernel-snaps. All 3 implemented components have a post-refresh hook
that is identical to their install hook.

  - For the nvidia-ko and nouveau hooks, these hooks copy the corresponding
    kernel modules to $SNAP_DATA/$(uname -r)/graphics, with the nvidia-ko hook
    linking the modules and attaching their module signatures before moving
    them. Both components' remove hooks delete the graphics directory. The
    nouveau hooks are intended to be generic kernel module component hooks
    with a special case for nouveau.

  - For the nvidia-user hooks, a directory corresponding to the kernel-gpu-2404
    interface is created. A sentinel file is placed in the directory to be able
    to notify the consumer of the kernel-gpu-2404 interface of file changes
    (e.g. a refresh). All the libraries in the component are copied into this
    directory. A mangler script is added for the consuming snap to have the
    correct environment variables present when using the provided libraries.
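
To make the hook behaviour above more concrete, here is a minimal sketch of
the copy, remove and sentinel steps. It is not the content of the hooks in
this patchset: the function names, the .sentinel file name and the
directories passed in by the caller are all hypothetical, and the linking and
signature handling done by the nvidia-ko hook is left out entirely.

```shell
#!/bin/sh
# Hypothetical sketch only: function names, the sentinel file name and all
# paths passed in by the caller are assumptions, not the patch contents.

# Install/post-refresh for a kernel-module component:
# copy every .ko under $1 into the graphics directory $2.
install_graphics_modules() {
    mkdir -p "$2" || return 1
    find "$1" -type f -name '*.ko' -exec cp {} "$2"/ \;
}

# Remove hook for a kernel-module component: delete the graphics directory.
remove_graphics_modules() {
    rm -rf "$1"
}

# Install/post-refresh for the userspace component: copy the libraries in $1
# into the interface directory $2, then rewrite a sentinel file so that a
# consumer can notice that the contents changed.
install_userspace_libs() {
    mkdir -p "$2" || return 1
    cp -R "$1"/. "$2"/ || return 1
    date > "$2/.sentinel"
}
```

Keeping the install and post-refresh logic in one function mirrors the fact
that both hooks are identical for each component.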

Nvidia components are mostly self-contained, but a few changes to the pc-kernel
snap were required. files/meta/kernel.yaml is required to enable kernel
module support in snapd. The kernel-gpu-2404 content interface is
declared for exposing nvidia userspace libraries, and is not intended to
be accessed directly by users.

To my understanding, the current test plan on our end includes smoke testing
for both the ubuntu core use case and the hybrid case with tpm-backed fde
(emulated through kvm, as hardware testing is currently not functional for
tpm-backed fde). Cert should be in charge of testing beyond this, but I don't
have much information about this yet. I should be able to confirm this and
explain the test plan in more detail early next week.

[Test case]
Nvidia components can be installed as follows:

 $ snap install pc-kernel+nvidia-550-ko pc-kernel+nvidia-550-user

The components install their files in $SNAP_DATA/modules/$(uname -r)/graphics
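
A quick way to confirm the result is to look for kernel modules in that
directory. The helper below is a minimal sketch; its name is hypothetical and
the directory is passed in by the caller rather than hard-coded.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if at least one .ko file exists under $1.
graphics_modules_present() {
    [ -d "$1" ] || return 1
    [ -n "$(find "$1" -type f -name '*.ko' | head -n 1)" ]
}
```

From a context where $SNAP_DATA is set, it could be called as
graphics_modules_present "$SNAP_DATA/modules/$(uname -r)/graphics".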

[Regression potential]
There is potential for regressions to be introduced by the pc-kernel install
hook, as it is executed on every install and refresh event. If this
script fails, the installation or update of the snap will abort.

Aaron Jauregui (1):
  snapcraft.yaml: Add nvidia-550 and nouveau component support

 files/meta/kernel.yaml |   1 +
 nvidia_packages        |  11 ++
 snapcraft.yaml         | 225 ++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 236 insertions(+), 1 deletion(-)
 create mode 100644 files/meta/kernel.yaml
 create mode 100644 nvidia_packages

Aaron Jauregui (1):
  nvidia-hooks: add hooks for nvidia kernel components

 hooks/module/install.module                   | 21 ++++++++++++++++
 hooks/module/post-refresh.module              | 21 ++++++++++++++++
 hooks/module/remove.module                    | 13 ++++++++++
 hooks/nvidia-ko/install.nvidia-ko             | 25 +++++++++++++++++++
 hooks/nvidia-ko/post-refresh.nvidia-ko        | 25 +++++++++++++++++++
 hooks/nvidia-ko/remove.nvidia-ko              |  6 +++++
 hooks/nvidia-user/install.nvidia-user         | 18 +++++++++++++
 .../kernel-gpu-2404-provider-mangler          | 12 +++++++++
 hooks/nvidia-user/remove.nvidia-user          | 10 ++++++++
 hooks/pc-kernel/install.pc-kernel             |  6 +++++
 hooks/pc-kernel/post-refresh.pc-kernel        |  6 +++++
 11 files changed, 163 insertions(+)
 create mode 100644 hooks/module/install.module
 create mode 100644 hooks/module/post-refresh.module
 create mode 100644 hooks/module/remove.module
 create mode 100644 hooks/nvidia-ko/install.nvidia-ko
 create mode 100644 hooks/nvidia-ko/post-refresh.nvidia-ko
 create mode 100644 hooks/nvidia-ko/remove.nvidia-ko
 create mode 100644 hooks/nvidia-user/install.nvidia-user
 create mode 100644 hooks/nvidia-user/kernel-gpu-2404-provider-mangler
 create mode 100644 hooks/nvidia-user/remove.nvidia-user
 create mode 100644 hooks/pc-kernel/install.pc-kernel
 create mode 100644 hooks/pc-kernel/post-refresh.pc-kernel

-- 
2.43.0



