[Bug 2007827] Re: flash-kernel failure when upgrading f-k and kernel in the same cycle

Dave Jones 2007827 at bugs.launchpad.net
Fri Feb 24 15:39:15 UTC 2023


Hmm, this is a tricky one and there's a lot I don't like here.

The first thing I don't like is I probably caused this issue back in
focal :). I was fixing an issue of flash-kernel triggers getting
"forgotten" during flash-kernel updates (LP: #1667742).

The second thing I don't like is that we apparently haven't noticed this
(quite serious) issue in all the releases since. That's ... suspicious?
I would at least have expected this to have cropped up for some people
on the pi images which ought to be seeing something similar and are
quite widely used.

The third thing I'm not particularly keen on is the proposed fix: it
causes flash-kernel to exit *assuming* it'll be run in future without
actually guaranteeing that (contrast with the trigger deferral logic
near the top of main where flash-kernel triggers itself and then exits
to ensure it will definitely be run later). It also doesn't necessarily
account for the scenario where flash-kernel is being upgraded to fix an
issue in flash-kernel's historic behaviour. If the current kernel needs
re-installing anyway (because there was some defect in the way an old
version of f-k handled things), even when a new kernel version will be
installed later, that should still be done as we can't guarantee that
the new kernel's initrd will be generated successfully.
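
For contrast, the self-trigger deferral pattern mentioned above can be
sketched roughly as follows. This is an illustration only, not
flash-kernel's actual code: the function names and the exact condition
(being invoked from another package's maintainer script, detected via
dpkg's DPKG_MAINTSCRIPT_PACKAGE variable) are assumptions.

```shell
#!/bin/sh
# Sketch only: function names and the deferral condition are hypothetical.

# Queue a dpkg trigger by name without making the calling package await it.
queue_trigger() {
    dpkg-trigger --no-await "$1"
}

# Return 0 if the work was deferred (the caller may exit 0 safely because a
# later run has been queued), or 1 if the caller must do the work now.
defer_to_trigger() {
    # dpkg sets DPKG_MAINTSCRIPT_PACKAGE while running maintainer scripts.
    if [ -n "${DPKG_MAINTSCRIPT_PACKAGE:-}" ] &&
       [ "$DPKG_MAINTSCRIPT_PACKAGE" != "flash-kernel" ]; then
        # Only report "deferred" if the trigger was actually queued.
        queue_trigger flash-kernel && return 0
    fi
    return 1
}
```

A caller would use something like "defer_to_trigger && exit 0" near the
top of main. The point is that the early exit happens only once a later
run has actually been queued, which is the guarantee the proposed fix
lacks.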

Hence, the two things we really want to avoid:

* Exiting without error when an error has actually occurred (i.e. the
initrd is missing, but it's *actually* missing not just "waiting to be
generated")

* Skipping flash-kernel executions when flash-kernel itself is being
upgraded, even when they might *seem* redundant (they're only redundant
in the case that everything works correctly, but the error scenarios
are valid edge cases)

I've spent a few hours digging into all the trigger logic that exists
between the kernel images, initramfs-tools (responsible for generating
the initrd), and flash-kernel. I'm reasonably convinced at this point
that we can't *prevent* flash-kernel from running at a point where a new
kernel isn't *completely* installed. The fact that flash-kernel *must*
run when it itself is upgraded means we're always vulnerable to being
run at that point.

However, flash-kernel is being run to install "the latest version". It
uses "linux-version list" to get the installed kernels. That operation
lists the kernels which *exist* (as, say,
/boot/vmlinuz-5.15.0-1018-xilinx-zynqmp) under /boot, but doesn't care
whether the kernel package is actually *installed fully* by dpkg yet.

For example: /boot/vmlinuz-5.15.0-1018-xilinx-zynqmp exists because the
linux-image-5.15.0-1018-xilinx-zynqmp has been unpacked, but it must
still be in "triggers-awaiting" state because the initramfs-tools
trigger has not yet run to generate the corresponding initrd.

So ... I think the root of the issue here is that flash-kernel is not
being sufficiently discriminating in its selection of the "latest"
kernel; it considers not-fully-installed kernels to be valid for
installation. In fact, even if the initrd *does* exist, if triggers are
still pending it still shouldn't consider that a valid candidate (this
can occur if, for example, a linux-modules-extra package has been
removed so the initrd needs to be regenerated to remove the modules
provided by it).

Conclusion:

I should enhance the filters after "linux-version list" to exclude
kernels from packages which do not have a status of "Installed". One
drawback: in the scenario above, this will result in flash-kernel
"pointlessly" re-installing 5.15.0-1015-xilinx-zynqmp (the "old", but
current kernel) on its first "real" run, rather than silently skipping
it. However, I don't think that's an error: consider that the
flash-kernel upgrade may very well be correcting something in
flash-kernel which requires it to re-run for the current kernel.

This should still operate correctly in the event that f-k is used with
"--force" to install a specified version of the kernel; the filtration is
only used to determine the "latest" kernel and the system admin is still
free to override this with their choice of version (even unpackaged
versions that are manually installed in /boot) via the "--force" flag.
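
As a rough sketch of the filter proposed in the conclusion (not the
actual patch), something along these lines would drop kernels whose
package dpkg does not yet consider fully installed. The function names
are hypothetical, and the sketch assumes each kernel version maps to a
package named linux-image-<version>, which may not hold for every
kernel flavour.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, and the package naming
# scheme (linux-image-<version>) is an assumption.

# Print dpkg's status for a package, e.g. "install ok installed".
# Prints nothing if dpkg does not know the package.
kernel_pkg_status() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null
}

# Read kernel versions on stdin (one per line, as "linux-version list"
# prints them) and echo only those whose package is fully installed.
# Any other state (unpacked, triggers outstanding, unknown) is dropped.
filter_installed_kernels() {
    while read -r version; do
        [ -n "$version" ] || continue
        status=$(kernel_pkg_status "linux-image-$version")
        case "$status" in
            "install ok installed") echo "$version" ;;
        esac
    done
}
```

It would be used as "linux-version list | filter_installed_kernels".
Note that manually installed kernels with no package at all are also
dropped by this sketch, which is why the "--force" override matters.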

-- 
You received this bug notification because you are a member of Ubuntu
Sponsors Team, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/2007827

Title:
  flash-kernel failure when upgrading f-k and kernel in the same cycle

Status in flash-kernel package in Ubuntu:
  New

Bug description:
  [Impact]
  In version 3.104ubuntu15 of flash-kernel, when both f-k and the kernel are upgraded in the same cycle, depending on the ordering of dpkg trigger execution, f-k may find the content of /boot "inconsistent" causing it to fail and return error exit status 1.

  Error message:
  Processing triggers for man-db (2.10.2-1) ...
  Processing triggers for flash-kernel (3.104ubuntu15) ...
  flash-kernel: installing version 5.15.0-1018-xilinx-zynqmp
  Initrd required for FIT method
  dpkg: error processing package flash-kernel (--configure):
   installed flash-kernel package post-installation script subprocess returned error exit status 1
  Processing triggers for linux-image-5.15.0-1018-xilinx-zynqmp (5.15.0-1018.20) ...
  /etc/kernel/postinst.d/initramfs-tools:
  update-initramfs: Generating /boot/initrd.img-5.15.0-1018-xilinx-zynqmp

  flash-kernel gets the latest kernel version from "linux-version list".
  When flash-kernel was triggered to generate the fitimage, the latest kernel version was "5.15.0-1018" but the initrd for it wasn't ready yet, so flash-kernel failed to generate the fitimage.

  A subsequent run of "apt install -f" fixed things because, by that
  point, the kernel's own trigger had executed, ensuring that update-
  initramfs had been run. In the case that f-k is run "prematurely" and
  finds itself in this situation (/boot/kernel-${ver} exists, but
  /boot/initrd-${ver} doesn't), it should probably bail out silently
  under the assumption that whatever is responsible for it will rectify
  the situation and trigger f-k again (as happens in the kernel postinst
  hooks).

  [Test Case]
  1. Flash an old image (with an out of date kernel and flash-kernel)
  2. sudo apt-get update
  3. sudo apt install flash-kernel with the fix and linux packages
  4. Upgrade should proceed without issue

  [Regression Potential]
  As with the previous flash-kernel uploads, it is possible that a breakage in the changed code can lead to issues with upgrading kernels (due to f-k being executed via a trigger at the end) or with Xilinx devices in the field not upgrading correctly. I will test all the changes extensively though.

  Related issues:
  LP: #1861292 flash-kernel failure during kernel upgrade

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/flash-kernel/+bug/2007827/+subscriptions



