[Bug 1918080] [NEW] Suggested fix: 10_linux_zfs will not detect kernels if /boot/grub/grubenv exists on the root dataset

Sam Van den Eynde 1918080 at bugs.launchpad.net
Mon Mar 8 00:02:29 UTC 2021


Public bug reported:

For the past few weeks, update-grub has no longer detected my
root-on-zfs install. This system does not follow the bpool/rpool layout,
as it was installed a long time ago.

I was able to pinpoint the issue to this part of the 10_linux_zfs
script:

    if [ -n "$(ls ${candidate_path} 2>/dev/null)" ]; then
        echo "${candidate_path}"
        return
    fi

This code seems to identify candidate locations for /boot directories.
It returns the candidate path as soon as that directory is non-empty, so
it only keeps looking elsewhere when the directory is empty. ZFS does
not require /boot to be empty in order to mount the target boot dataset
on it (overlay=on), but I assume this way some other candidate paths can
be skipped quickly.
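
As a rough illustration of that behaviour, the sketch below wraps the
quoted check in a function and runs it against two throwaway
directories. The function name original_check, the temporary paths, and
the quoting of the path are assumptions made for the demo; none of them
come from 10_linux_zfs itself.

```shell
#!/bin/sh
# Hypothetical wrapper around the check quoted above; the name is
# illustrative and not taken from 10_linux_zfs.
original_check() {
    candidate_path="$1"
    if [ -n "$(ls "${candidate_path}" 2>/dev/null)" ]; then
        echo "${candidate_path}"
        return
    fi
}

# Throwaway directories standing in for /boot on a root dataset.
demo_root="$(mktemp -d)"
mkdir -p "${demo_root}/empty_boot" "${demo_root}/stray_boot/grub"

# Empty directory: nothing is printed, so a caller would keep searching.
original_check "${demo_root}/empty_boot"

# Directory holding only grub/: the path is printed and the search stops.
original_check "${demo_root}/stray_boot"

rm -rf "${demo_root}"
```

A directory that contains nothing but a stray grub subdirectory is
therefore treated like a fully populated /boot.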

So I think I basically ran into this issue:
https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1881442.

When grubenv gets created on the ZFS root dataset during a failed boot
sequence, /boot on that dataset is no longer empty, and 10_linux_zfs
stops detecting the install from then on, leaving grub.cfg without valid
boot entries.

I was able to reproduce the issue by creating and removing /boot/grub on
my root dataset, and to fix it by letting the 10_linux_zfs script
continue when grub is the only entry in candidate_path:

    if [ -n "$(ls ${candidate_path} 2>/dev/null)" ] && [ "$(ls ${candidate_path} 2>/dev/null)" != "grub" ]; then
        echo "${candidate_path}"
        return
    fi
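
A minimal sketch of the same idea, listing the directory only once and
quoting the path, could look like the following. The function name
patched_check, the contents variable, and the demo directories are
hypothetical and only serve to show the three cases; this is not the
code shipped in 10_linux_zfs.

```shell
#!/bin/sh
# Hypothetical wrapper around the patched check; the function name and
# the "contents" variable are illustrative, not taken from 10_linux_zfs.
patched_check() {
    candidate_path="$1"
    # List the directory once and reuse the result in both tests.
    contents="$(ls "${candidate_path}" 2>/dev/null)"
    if [ -n "${contents}" ] && [ "${contents}" != "grub" ]; then
        echo "${candidate_path}"
        return
    fi
}

# Throwaway directories standing in for /boot on different datasets.
demo_root="$(mktemp -d)"
mkdir -p "${demo_root}/empty_boot" \
         "${demo_root}/only_grub/grub" \
         "${demo_root}/real_boot/grub"
touch "${demo_root}/real_boot/vmlinuz-demo"

patched_check "${demo_root}/empty_boot"  # prints nothing
patched_check "${demo_root}/only_grub"   # prints nothing: grub alone is ignored
patched_check "${demo_root}/real_boot"   # prints the path: other files are present

rm -rf "${demo_root}"
```

With this variant a directory is accepted only when it holds something
other than a lone grub entry, which matches the intent of the one-line
change above.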

Even once the grub-initrd-fallback.service bug is addressed, I would
still allow /boot/grub to exist on the root dataset. Otherwise, that bug
seems to become quite critical.

** Affects: grub2 (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: grub zfs

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to grub2 in Ubuntu.
https://bugs.launchpad.net/bugs/1918080
