[Bug 1187344] [NEW] BOOT_DEGRADED=true hangs at initramfs prompt with incomplete RAID0 array

Brian Candler 1187344@bugs.launchpad.net
Tue Jun 4 11:11:37 UTC 2013


Public bug reported:

This is an Ubuntu 12.04.2 x86_64 server.

It has "BOOT_DEGRADED=true" in /etc/initramfs-tools/conf.d/mdadm

It has two internal drives in a mirrored pair, and 22 SSDs in a RAID0
array.

# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sdx6[1] sdw5[0]
      361195328 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sdx7[1] sdw6[0]
      28303232 blocks super 1.2 [2/2] [UU]

md127 : active raid0 sds[18] sdr[17] sdq[16] sdu[21] sdt[19] sdo[14] sdv[20] sdp[15] sdl[1] sdn[13] sdk[2] sdj[11] sdm[0] sdh[9] sdg[8] sdi[10] sdc[6] sdf[7] sdd[12] sda[3] sde[5] sdb[4]
      11002314752 blocks super 1.2 16384k chunks

unused devices: <none>

The problem
===========

If one or more of the SSDs is not detected, boot hangs at the following
point:

--------
md127 : inactive sdr[17](S) ...

unused devices:<none>
Attempting to start the RAID in degraded mode...
mdadm: CREATE user root not found
mdadm: CREATE group disk not found
[   23.638700] md/raid0:md127: too few disks (18 of 22) - aborting!
[   23.638818] md: pers->run() failed ...
mdadm: failed to start array /dev/md/SSD: Invalid argument
mdadm: CREATE user root not found
mdadm: CREATE group disk not found
Could not start the RAID in degraded mode.
Dropping to a shell.


BusyBox v1.18.5 (Ubuntu 1:1.18.5-1ubuntu4.1) built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs)
--------

At this point it's possible to type "exit" to continue the boot, but
only if you have console access.
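
For reference, recovery from that prompt is entirely manual; something
along these lines works (mdadm is clearly present in the initramfs,
since it printed the messages above):

    (initramfs) cat /proc/mdstat          # see which members were detected
    (initramfs) mdadm --stop /dev/md127   # optionally release the partial array
    (initramfs) exit                      # continue booting without it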

How to replicate
================

This is easy to replicate: remove one or more of the RAID0 drives and
reboot.

It seems not to matter whether the array is listed in
/etc/mdadm/mdadm.conf. Originally it was not, but I tried

    /usr/share/mdadm/mkconf >/etc/mdadm/mdadm.conf

and it made no difference.
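
For completeness: /etc/mdadm/mdadm.conf only reaches the boot
environment via the initramfs, so the initramfs has to be rebuilt after
regenerating the file, roughly:

    /usr/share/mdadm/mkconf >/etc/mdadm/mdadm.conf
    update-initramfs -u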

Desired behaviour
=================

If I have BOOT_DEGRADED=true then I expect the system to continue
booting unattended, not drop to an initramfs prompt. Combined with the
"nobootwait" option in /etc/fstab, the system should complete its boot
automatically.
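
For illustration, the corresponding /etc/fstab entry would look
something like this (the mount point and filesystem type here are
placeholders):

    /dev/md/SSD   /data   xfs   defaults,nobootwait   0   2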

Possibly there is a difference between a working but degraded array and
an array with too few devices to start at all. The latter would include
things like a RAID5 with 2 or more disks missing, as well as a RAID0
with 1 or more disks missing.

I still want the system to be able to boot under these conditions. This
would allow remote testing of the disks, reading the serial numbers of
the detected disks, rebuilding the RAID0 array with fewer disks, etc.
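
As a sketch of that last step only (device names are illustrative, and
recreating a RAID0 discards whatever data was on it):

    mdadm --stop /dev/md127
    mdadm --create /dev/md/SSD --level=0 --raid-devices=21 /dev/sd[a-u]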

Updates
=======

To ensure this was not a recently-fixed issue, I did a full update and
repeated the test. The behaviour was unchanged.

The timestamp on the initrd file is newer than the timestamp on
conf.d/mdadm, so the rebuilt initramfs should already contain the
current setting:

root@ar24-5:/etc/initramfs-tools# ls -l /etc/initramfs-tools/conf.d/mdadm
-rw-r--r-- 1 root root 653 Jun  4 09:22 /etc/initramfs-tools/conf.d/mdadm
root@ar24-5:/etc/initramfs-tools# ls -l /boot/initrd.img-3.2.0-45-generic
-rw-r--r-- 1 root root 14650600 Jun  4 09:47 /boot/initrd.img-3.2.0-45-generic
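
To double-check that the setting actually made it into that image, the
initramfs contents can be listed (lsinitramfs ships with
initramfs-tools); it should show conf/conf.d/mdadm:

    lsinitramfs /boot/initrd.img-3.2.0-45-generic | grep conf.d/mdadm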

** Affects: initramfs-tools (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: boot degraded initramfs mdadm mdraid raid
