Noble /boot mount failures
Krister Johansen
kjlx at templeofstupid.com
Wed Dec 4 18:23:24 UTC 2024
Hi,
I wanted to raise awareness about a bug that my team has been running
into on Noble. I filed a Launchpad bug with the details here:
https://bugs.launchpad.net/ubuntu/+source/util-linux/+bug/2090972
The short explanation is that libblkid in util-linux can misread the
ext4 superblock checksum and incorrectly decide that it doesn't match.
When systemd-udevd is the caller, this can result in udev removing
devlinks for a filesystem that it thinks no longer exists, even though
the device is still present. (E.g. if libblkid doesn't find a
filesystem, it doesn't return the label property, and a label property
disappearing can result in udev removing that devlink.)
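As a rough illustration of that chain of events, here is a toy model in
Python. It is not udev's actual logic: the function names are made up for
this sketch, the label "BOOT" and the UUID are hypothetical, and the
property names are only modeled on udev's ID_FS_* properties (the real
rules are more involved). It just shows how one bad probe that returns no
filesystem properties makes the derived links go away.

```python
# Toy model only: NOT udev's implementation. Function names are
# hypothetical; property names are modeled on udev's ID_FS_* properties.

def devlinks_for(props):
    """Return the set of /dev/disk symlinks implied by probe properties."""
    links = set()
    if "ID_FS_LABEL" in props:
        links.add("/dev/disk/by-label/" + props["ID_FS_LABEL"])
    if "ID_FS_UUID" in props:
        links.add("/dev/disk/by-uuid/" + props["ID_FS_UUID"])
    return links


def links_to_remove(old_props, new_props):
    """Links that existed after the old probe but not after the new one."""
    return devlinks_for(old_props) - devlinks_for(new_props)


# A good probe sees the ext4 filesystem with its label and UUID.
good_probe = {
    "ID_FS_TYPE": "ext4",
    "ID_FS_LABEL": "BOOT",       # hypothetical label
    "ID_FS_UUID": "1234-abcd",   # hypothetical UUID
}

# A probe that misreads the checksum finds no filesystem at all.
bad_probe = {}

for link in sorted(links_to_remove(good_probe, bad_probe)):
    print("would remove", link)
```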
The mount of /boot fails because it uses a /dev/disk/by-label path, and
that path sometimes disappears because of the bug. Packaging operations
and anything else that interacts with /boot then fail as well. (The
failures are usually update-grub related.)
We've also seen the root disk have this problem, though the failure
there was more subtle. The device mounted, but a subsequent mkinitramfs
failed during a kernel package install, because of missing
/dev/disk/by-label links.
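For anyone who wants to check whether a machine is currently in this
state, a minimal sketch along these lines may help. The helper name
check_label_link and the label "BOOT" are assumptions for illustration;
substitute whatever label your fstab actually references. Since the link
only disappears intermittently, a single passing check does not rule the
bug out.

```python
import os


def check_label_link(label, base="/dev/disk/by-label"):
    """Return (ok, target) for the by-label symlink of `label`.

    ok is True only if the symlink exists and resolves to an existing
    path; target is the resolved path, or None if the link is missing.
    """
    path = os.path.join(base, label)
    if not os.path.islink(path):
        return (False, None)
    target = os.path.realpath(path)
    return (os.path.exists(target), target)


if __name__ == "__main__":
    # "BOOT" is a hypothetical label used only for this example.
    ok, target = check_label_link("BOOT")
    if ok:
        print("by-label link present, resolves to", target)
    else:
        print("by-label link missing or dangling")
```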
I've fixed the bug in libblkid and gotten the fix merged upstream.
We've also patched this locally in our own version of util-linux. I've
attached a patch to the bug report which is the version of the fix we
used against the Noble packages. We haven't seen the problem reproduce
since rolling this out.
The underlying problem affects util-linux >= 2.39. Would it be possible
for somebody to pull this patch into the affected Ubuntu util-linux
versions?
Thanks,
-K