[Bug 1187344] [NEW] BOOT_DEGRADED=true hangs at initramfs prompt with incomplete RAID0 array
Brian Candler
1187344@bugs.launchpad.net
Tue Jun 4 11:11:37 UTC 2013
Public bug reported:
This is an Ubuntu 12.04.2 x86_64 server.
It has "BOOT_DEGRADED=true" in /etc/initramfs-tools/conf.d/mdadm
It has two internal drives in a mirrored pair, and 22 SSDs in a RAID0
array.
# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sdx6[1] sdw5[0]
361195328 blocks super 1.2 [2/2] [UU]
md0 : active raid1 sdx7[1] sdw6[0]
28303232 blocks super 1.2 [2/2] [UU]
md127 : active raid0 sds[18] sdr[17] sdq[16] sdu[21] sdt[19] sdo[14] sdv[20] sdp[15] sdl[1] sdn[13] sdk[2] sdj[11] sdm[0] sdh[9] sdg[8] sdi[10] sdc[6] sdf[7] sdd[12] sda[3] sde[5] sdb[4]
11002314752 blocks super 1.2 16384k chunks
unused devices: <none>
The problem
===========
If one or more of the SSDs is not detected, the boot hangs at the
following point:
--------
md127 : inactive sdr[17](S) ...
unused devices:<none>
Attempting to start the RAID in degraded mode...
mdadm: CREATE user root not found
mdadm: CREATE group disk not found
[ 23.638700] md/raid0:md127: too few disks (18 of 22) - aborting!
[ 23.638818] md: pers->run() failed ...
mdadm: failed to start array /dev/md/SSD: Invalid argument
mdadm: CREATE user root not found
mdadm: CREATE group disk not found
Could not start the RAID in degraded mode.
Dropping to a shell.
BusyBox v1.18.5 (Ubuntu 1:1.18.5-1ubuntu4.1) built-in shell (ash)
Enter 'help' for a list of built-in commands.
(initramfs)
--------
At this point it's possible to type "exit" to continue the boot, but
only if you have console access.
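For completeness, the manual recovery from the console is roughly this (a sketch of the interactive session, not a transcript):
--------
(initramfs) cat /proc/mdstat     # md0/md1 (the mirrored system disks) are intact
(initramfs) exit                 # carry on booting from the RAID1 root
--------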
How to replicate
================
This is easy to replicate by removing one or more of the RAID0 drives
and rebooting.
It seems not to matter whether the array is listed in
/etc/mdadm/mdadm.conf. Originally it was not, but I tried
/usr/share/mdadm/mkconf >/etc/mdadm/mdadm.conf
and it made no difference.
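(For the record, the sequence was roughly as below; the update-initramfs step is implied by the initrd timestamp shown later.)
--------
/usr/share/mdadm/mkconf > /etc/mdadm/mdadm.conf
update-initramfs -u
--------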
Desired behaviour
=================
If I have BOOT_DEGRADED=true then I expect the system to continue
booting unattended, not drop to an initramfs prompt. Combined with the
"nobootwait" option in /etc/fstab, the system should complete its
boot automatically.
Possibly there is a difference between a working but degraded array and
an array with too few devices to start at all. The latter would include
a RAID5 with 2 or more disks missing, as well as a RAID0 with 1 or more
disks missing.
I still want the system to be able to boot under these conditions. This
would allow remote testing of the disks, reading the serial numbers of
the detected disks, rebuilding the RAID0 array with fewer disks, and so on.
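As an illustration of the kind of remote diagnosis this would permit, something like the following (assuming smartmontools is installed; device names are illustrative) would list the serial numbers of whichever disks were detected:
--------
for d in /dev/sd[a-v]; do
    echo -n "$d: "
    smartctl -i "$d" | grep -i 'serial number'
done
--------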
Updates
=======
To ensure this was not a recently-fixed issue, I did a full update and
repeated the test. The behaviour was unchanged.
The timestamp on the initrd file is newer than the timestamp on
conf.d/mdadm, so the initramfs was rebuilt after BOOT_DEGRADED was set:
root@ar24-5:/etc/initramfs-tools# ls -l /etc/initramfs-tools/conf.d/mdadm
-rw-r--r-- 1 root root 653 Jun 4 09:22 /etc/initramfs-tools/conf.d/mdadm
root@ar24-5:/etc/initramfs-tools# ls -l /boot/initrd.img-3.2.0-45-generic
-rw-r--r-- 1 root root 14650600 Jun 4 09:47 /boot/initrd.img-3.2.0-45-generic
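One way to double-check that the setting really is inside that image is to unpack it and grep (a sketch; the path of the copied conf.d file inside the initramfs may vary):
--------
mkdir /tmp/initrd-check && cd /tmp/initrd-check
zcat /boot/initrd.img-3.2.0-45-generic | cpio -id
grep -r BOOT_DEGRADED .
--------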
** Affects: initramfs-tools (Ubuntu)
Importance: Undecided
Status: New
** Tags: boot degraded initramfs mdadm mdraid raid
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to initramfs-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1187344