[Bug 1879980] Re: Fail to boot with LUKS on top of RAID1 if the array is broken/degraded
John Gilmore
1879980 at bugs.launchpad.net
Mon Nov 2 11:00:19 UTC 2020
I am running a laptop whose internal NVMe drive is partitioned to
contain an encrypted LUKS partition holding an LVM2 PV. That PV
belongs to a VG (vgubuntu) which also includes a partition residing
on an external USB drive. The plan is to make part of the external PV
into a RAID-1 image of the internal partition, so that the internal
NVMe is backed up automatically whenever I plug in that drive, making
it a USB-bootable bit-for-bit copy. (That part doesn't work yet
because of a block-size mismatch between the 4K hard drive and the
512-byte NVMe partition; I'll need to dump, reformat, and restore the
NVMe LV into 4K blocks before LVM will succeed in making it part of a
RAID set.)
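One way to confirm a block-size mismatch like the one described above is to compare the logical block sizes that the kernel reports in sysfs. The following is a minimal Python sketch, not part of any Ubuntu or LVM tooling. The helper names and the device name "sda" are hypothetical, and it assumes whole-disk (or dm-N) device names that have a queue/ directory under /sys/block.

```python
from pathlib import Path


def read_logical_block_size(device, sysfs_root="/sys/block"):
    """Return the logical block size (bytes) the kernel reports for a device.

    `device` is a whole-disk or device-mapper name such as "nvme0n1"
    (hypothetical example); partitions have no queue/ directory of their own.
    """
    path = Path(sysfs_root) / device / "queue" / "logical_block_size"
    return int(path.read_text().strip())


def mismatched(sizes):
    """Return True if the given {device: block_size} mapping holds
    more than one distinct block size."""
    return len(set(sizes.values())) > 1


def collect_sizes(devices, sysfs_root="/sys/block"):
    """Read the logical block size of every named device."""
    return {dev: read_logical_block_size(dev, sysfs_root) for dev in devices}
```

Calling `mismatched(collect_sizes(["nvme0n1", "sda"]))` on a system with those two disks would report whether they disagree; a result of True is consistent with the 512-byte versus 4K situation described above.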
I have been able to boot this laptop (without the external drive
attached) even prior to this bug fix. I can't tell you exactly how it
worked, but it worked, possibly because the RAID1 isn't set up yet;
instead the VG just includes both PVs.
I upgraded to the released fix for this bug in 20.04, as part of
installing all current security patches. Now when I boot the system, it
detects the encrypted partition, asks for the password, gets it, and
succeeds in fsck-ing the root partition. But then it loops and fscks
it again. And again. And again. (I noticed this because of the long
delay with the circling icon on the splash screen. I switched to
Ctrl-Alt-F1 to see console messages, and there I saw it doing the fsck
over and over.) Eventually THIS behavior times out too, and the system
does boot.
But I suspect that this is not exactly what you wanted this patch to do.
The system used to boot very quickly; now, after the patch for this bug,
it doesn't.
I'm running 20.04.1 on an amd64 (Lenovo Ideapad Flex 5 14", AMD Ryzen 5
4500U) laptop.
# pvdisplay
  WARNING: Couldn't find device with uuid u2T6W5-rI3L-40C8-rZc3-uS6H-rMsV-U58ic5.
  WARNING: VG vgubuntu is missing PV u2T6W5-rI3L-40C8-rZc3-uS6H-rMsV-U58ic5 (last written to /dev/mapper/luks-d43f3ec2-2633-46d0-bd1b-1c9f46e13c4d).
  --- Physical volume ---
  PV Name                /dev/mapper/nvme0n1p3_crypt
  VG Name                vgubuntu
  PV Size                237.24 GiB / not usable 0
  Allocatable            yes
  PE Size                4.00 MiB
  Total PE               60734
  Free PE                253
  Allocated PE           60481
  PV UUID                SQ4XNm-dKJQ-ild3-dx2k-lrGL-fqay-5rY8nY

  --- Physical volume ---
  PV Name                [unknown]
  VG Name                vgubuntu
  PV Size                <3.64 TiB / not usable 4.00 MiB
  Allocatable            yes
  PE Size                4.00 MiB
  Total PE               953610
  Free PE                817930
  Allocated PE           135680
  PV UUID                u2T6W5-rI3L-40C8-rZc3-uS6H-rMsV-U58ic5

# lvdisplay
  WARNING: Couldn't find device with uuid u2T6W5-rI3L-40C8-rZc3-uS6H-rMsV-U58ic5.
  WARNING: VG vgubuntu is missing PV u2T6W5-rI3L-40C8-rZc3-uS6H-rMsV-U58ic5 (last written to /dev/mapper/luks-d43f3ec2-2633-46d0-bd1b-1c9f46e13c4d).
  --- Logical volume ---
  LV Path                /dev/vgubuntu/root
  LV Name                root
  VG Name                vgubuntu
  LV UUID                Dy2wIe-nhEd-BZN5-i3bg-AU3P-uenQ-fuPkSR
  LV Write Access        read/write
  LV Creation host, time ubuntu, 2020-07-16 16:55:07 -0700
  LV Status              available
  # open                 1
  LV Size                236.25 GiB
  Current LE             60481
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1

  --- Logical volume ---
  LV Path                /dev/vgubuntu/root4tb
  LV Name                root4tb
  VG Name                vgubuntu
  LV UUID                KFe7kE-bPOs-md1j-iVqT-kBQy-mj1U-XTpfM9
  LV Write Access        read/write
  LV Creation host, time pad.toad.com, 2020-09-03 01:11:24 -0700
  LV Status              NOT available
  LV Size                230.00 GiB
  Current LE             58880
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto

  --- Logical volume ---
  LV Path                /dev/vgubuntu/misc
  LV Name                misc
  VG Name                vgubuntu
  LV UUID                QKd7pK-CfdG-VxFb-VThb-QzWy-oKVO-NQdchx
  LV Write Access        read/write
  LV Creation host, time pad.toad.com, 2020-09-03 01:40:22 -0700
  LV Status              NOT available
  LV Size                300.00 GiB
  Current LE             76800
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto

#
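
As a sanity check on the numbers above, the extent counts can be converted to sizes and compared across the two listings. This is only an arithmetic sketch in Python using values copied from the output; the function and variable names are made up for illustration, and the extent-to-PV mapping itself is not shown by plain pvdisplay/lvdisplay, so the conclusion is an inference rather than something the tools state.

```python
PE_SIZE_MIB = 4  # "PE Size 4.00 MiB" in the pvdisplay output


def extents_to_gib(extents, pe_size_mib=PE_SIZE_MIB):
    """Convert a count of LVM extents to GiB."""
    return extents * pe_size_mib / 1024


# "Current LE" per logical volume, copied from lvdisplay.
lv_extents = {"root": 60481, "root4tb": 58880, "misc": 76800}

# "Allocated PE" per physical volume, copied from pvdisplay.
pv_allocated = {"nvme0n1p3_crypt": 60481, "missing": 135680}

for name, extents in lv_extents.items():
    print(f"{name}: {extents_to_gib(extents):.2f} GiB")

# Do the two unavailable LVs account for everything allocated on the missing PV?
unavailable = lv_extents["root4tb"] + lv_extents["misc"]
print("root4tb + misc == missing PV allocation:",
      unavailable == pv_allocated["missing"])

# Does root account for everything allocated on the internal PV?
print("root == internal PV allocation:",
      lv_extents["root"] == pv_allocated["nvme0n1p3_crypt"])
```

The computed sizes agree with the reported LV sizes, and both comparisons hold. That is consistent with root living entirely on the internal PV while root4tb and misc live entirely on the missing external PV, which would explain why only those two show "NOT available". Running lvdisplay with its --maps option would show the actual mapping.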
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to initramfs-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1879980
Title:
Fail to boot with LUKS on top of RAID1 if the array is broken/degraded
Status in cryptsetup package in Ubuntu:
Fix Released
Status in initramfs-tools package in Ubuntu:
Fix Released
Status in mdadm package in Ubuntu:
Opinion
Status in cryptsetup source package in Xenial:
Won't Fix
Status in initramfs-tools source package in Xenial:
Won't Fix
Status in mdadm source package in Xenial:
Won't Fix
Status in cryptsetup source package in Bionic:
Fix Released
Status in initramfs-tools source package in Bionic:
Fix Released
Status in mdadm source package in Bionic:
Opinion
Status in cryptsetup source package in Focal:
Fix Released
Status in initramfs-tools source package in Focal:
Fix Released
Status in mdadm source package in Focal:
Opinion
Status in cryptsetup source package in Groovy:
Fix Released
Status in initramfs-tools source package in Groovy:
Fix Released
Status in mdadm source package in Groovy:
Opinion
Status in cryptsetup package in Debian:
New
Bug description:
[Impact]
* Consider a setup with an encrypted rootfs on top of an md RAID1
device: Ubuntu is currently unable to decrypt the rootfs if the array
becomes degraded, for example if one of the array's members is
removed.
* The problem has 2 main aspects: first, the cryptsetup initramfs
script attempts to decrypt the array only in the local-top boot stage,
and if that fails, it gives up and drops the user to a shell (boot is
aborted).
* Second, the mdadm initramfs script that assembles degraded arrays
executes later in the boot, in the local-block stage. So, in a stacked
setup of encrypted root on top of RAID, if the RAID is degraded,
cryptsetup fails early in the boot, preventing mdadm from assembling
the degraded array.
* The solution proposed here has 2 components: first, the cryptsetup
script is modified to allow a gentle failure in the local-top stage;
it then retries for a while (according to a heuristic based on
ROOTDELAY, with a minimum of 30 executions) in a later stage
(local-block). This gives other initramfs scripts, like mdadm in the
local-block stage, time to run. This is how it is meant to work
according to the initramfs-tools documentation (although Ubuntu
changed it a bit with wait-for-root, hence we stopped looping on
local-block; see next bullet).
* Second, initramfs-tools was adjusted: currently, it runs the mdadm
local-block script for a while, in order to assemble the arrays in a
non-degraded mode. We extended this approach to also execute
cryptsetup, such that after mdadm ends its execution, we execute
cryptsetup at least one more time. In an ideal world we would loop on
local-block as Debian's initramfs does (which would remove the
hardcoded mdadm/cryptsetup mentions from the initramfs-tools code),
but this would be a really big change, probably not SRU-able. I plan
to work on that for future Ubuntu releases.
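
The staged behaviour described in the bullets above can be modelled roughly as follows. This is a simplified Python sketch of the control flow only, not the actual initramfs shell scripts: the function names are hypothetical, each loop iteration stands in for one pass over the local-block hooks, and the exact ROOTDELAY heuristic is not given in the description, so `max(30, rootdelay)` is an assumption.

```python
import time

MIN_RETRIES = 30  # "minimum of 30 executions" from the description


def retry_budget(rootdelay):
    """Number of local-block attempts.

    Assumption: the real heuristic is only described as "based on
    ROOTDELAY"; max() is a stand-in for illustration.
    """
    return max(MIN_RETRIES, rootdelay)


def boot(try_unlock, assemble_arrays, rootdelay=0, delay=1.0, sleep=time.sleep):
    """Model of the patched flow; returns True once the root device unlocks.

    try_unlock      -- callable standing in for the cryptsetup hook
    assemble_arrays -- callable standing in for the mdadm local-block hook
    """
    # local-top: a single attempt that now fails gently instead of
    # dropping to a shell.
    if try_unlock():
        return True

    # local-block: give mdadm a chance to assemble the (possibly
    # degraded) array, then retry cryptsetup, with a delay between passes.
    for _ in range(retry_budget(rootdelay)):
        assemble_arrays()
        if try_unlock():
            return True
        sleep(delay)

    # Budget exhausted: only now would the boot be considered failed.
    return False
```

Under these assumptions, a device that never appears costs at least 30 one-second waits before the loop gives up, whereas a degraded array that mdadm assembles on an early pass is unlocked on that same pass.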
[Test case]
* Install Ubuntu in a virtual machine with 2 disks. Use the installer
to create a RAID1 volume and an encrypted root on top of it.
* Boot the VM, and use "sgdisk"/"wipefs" to erase the partition table
from one of the RAID members. Reboot; the system will fail to mount
the rootfs and will not continue the boot process.
* With the initramfs-tools/cryptsetup patches proposed here, the
rootfs is mounted normally.
[Regression potential]
* There is potential for regressions, since this is a change in 2
boot components. The patches were designed to keep the regular case
working; they change only the failure case, which is not currently
working anyway.
* A modification in the behavior of cryptsetup was introduced: now,
if the password is entered incorrectly 3 times (the default maximum
attempts), the script doesn't "panic" and drop to a shell immediately;
instead it runs once more (or twice, if mdadm is installed) before
failing. This is a minor change given the benefit of being able to
mount the rootfs in a degraded RAID1 scenario.
* Other potential regressions could show up as boot problems, but the
change in initramfs-tools specifically is not invasive; it just may
delay boot time a bit, given that we now run cryptsetup multiple times
in local-block, with 1-second delays between executions.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/1879980/+subscriptions