[Bug 780492] Re: Boot failure with root on /dev/md* (raid)

molostoff 780492 at bugs.launchpad.net
Thu Sep 22 21:15:47 UTC 2011


This behaviour *is simple to reproduce* on an Ubuntu install with an LVM2 layout:

As an example, I have Ubuntu Natty with LVM2. Suppose vgs reports 20%
free space in the system volume group.

lvcreate -n test -l 15%VG sysvolgroup   # create a new empty volume (percent sizes need -l, not -L)
...
pvcreate /dev/sysvolgroup/test          # initialise that volume as a (nested) physical volume
...
# everything goes fine here until reboot; when the system boots, udev reports

"udevd[PID] worker [PID] unexpectedly returned with status 0x0100"

Booting with the udev.log-priority=debug kernel parameter (a sketch of
enabling it follows below) shows that the device on which udevd times
out is [ /dev/dm-XX ], to which the /dev/sysvolgroup/test link points.
In this sample it times out within scripts/local-premount at the rule:

85-lvm2.rules:  RUN+="watershed sh -c '/sbin/lvm vgscan; /sbin/lvm vgchange -a y'"

After the timeout everything goes well again - until the next boot.
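
For reference, one way to enable that debug output is via the kernel
command line in GRUB; this is only a sketch, assuming the stock Ubuntu
GRUB2 setup:

# /etc/default/grub -- append the udev debug parameter to the default command line
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash udev.log-priority=debug"

# regenerate the GRUB configuration, then reboot to capture the debug log
sudo update-grub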

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to mdadm in Ubuntu.
https://bugs.launchpad.net/bugs/780492

Title:
  Boot failure with root on /dev/md* (raid)

Status in “mdadm” package in Ubuntu:
  Confirmed

Bug description:
  Binary package hint: mdadm

  Description:	Ubuntu 11.04
  Release:	11.04

  Package: mdadm
  Version: 3.1.4+8efb9d1ubuntu4.1 

  There are several bugs relating to boot failures when the root
  partition is on an mdadm managed RAID array, however none of them seem
  to describe the problem I am seeing with 11.04.

  Some context:
  - Ubuntu 11.04 x86_64 upgraded from 10.10 x86_64 using update-manager 
  - three raid arrays across 9 partitions on three physical disks - /dev/md125 (raid10), /dev/md126 (raid5), /dev/md127 (raid10)
  - root partition is /dev/md127p3 with /boot on /dev/md127p1
  - fstab and grub.cfg refer to partitions using UUID, not device names

  Expected behaviour:
  System boots from root partition on an mdadm managed raid array

  Actual behaviour:
  When booting with the root partition on an mdadm-managed raid array, the process fails: the udevd daemon hangs right after the init-premount and local-top scripts execute. The error message, repeated for each array, is:

  "udevd[PID] worker [PID] unexpectedly returned with status 0x0100"

  From my debugging efforts it would seem that there is an issue with
  how mdadm 3.x communicates with udevd, or the other way round. While
  the arrays get detected, assembling them at boot causes udevd to hang.
  The only way I have found to work around this problem so far was to
  force-downgrade mdadm from 3.1.4+8efb9d1ubuntu4.1 (Natty repos) to
  2.6.7.1-1ubuntu16 (Maverick repos) and regenerate the initrd. While
  booting is much slower than under 10.10, the OS comes up in a
  workable state with all arrays properly assembled and healthy.
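
  For anyone trying the same workaround, the downgrade was roughly the
  following; this is only a sketch, assuming the Maverick mdadm .deb
  has been fetched manually (the filename is illustrative):

  # install the older mdadm package; the .deb filename is an example
  sudo dpkg -i mdadm_2.6.7.1-1ubuntu16_amd64.deb
  # hold the package so apt does not upgrade it back to the Natty version
  echo "mdadm hold" | sudo dpkg --set-selections
  # rebuild the initramfs so the boot image picks up the downgraded mdadm
  sudo update-initramfs -u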

  An identical issue exists on my system when it is started from the 11.04 LiveCD.
  If I install mdadm from the 11.04 repos *and* leave udev running, the udev daemon hangs a few seconds after "mdadm --assemble --scan" is executed. If I disable the udev service first, all raid arrays are assembled and started within a second or two (see the sketch below). Again, downgrading mdadm to 2.6.7.1-1ubuntu16 solves the problem.
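
  The manual assembly without udev running was essentially the
  following; a sketch only, assuming udev is managed as an upstart job
  (as it is on 11.04):

  # stop the udev daemon so it cannot interfere with array assembly
  sudo service udev stop
  # assemble every array listed in mdadm.conf or found by scanning
  sudo mdadm --assemble --scan
  # restart udev afterwards so device events are processed again
  sudo service udev start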

  The very same setup previously worked flawlessly in both Ubuntu
  10.04 LTS and 10.10.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/780492/+subscriptions
