[Bug 16474] New: Boot hang, apparently blocking on SW RAID rebuild

bugzilla-daemon at bugzilla.ubuntu.com
Tue Sep 27 18:02:21 UTC 2005


Please do not reply to this email.  You can add comments at
http://bugzilla.ubuntu.com/show_bug.cgi?id=16474
Ubuntu | kernel-package

           Summary: Boot hang, apparently blocking on SW RAID rebuild
           Product: Ubuntu
           Version: unspecified
          Platform: amd64
        OS/Version: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: kernel-package
        AssignedTo: ben.collins at ubuntu.com
        ReportedBy: finley at anl.gov
         QAContact: kernel-bugs at lists.ubuntu.com


I am experiencing the issue described in
http://bugzilla.ubuntu.com/show_bug.cgi?id=10916 on a new Sun Fire V40z server
in a well-cooled machine room.

Hardware:
- Quad CPU amd64 -- AMD Opteron(tm) Processor 852
- 32G memory
- 2x SCSI disks

RAID config:
- /dev/md0 RAID1
  Mount point: /boot
  Partitions:  /dev/sda1, /dev/sdb1
- /dev/md1 RAID1
  Physical device for LVM vg0
  $ mount | grep vg0
  /dev/mapper/vg0-root on / type ext3 (rw,errors=remount-ro)
  /dev/mapper/vg0-tmp on /tmp type ext3 (rw)
  /dev/mapper/vg0-var on /var type ext3 (rw)

I didn't try the "acpi=off" option, but was able to temporarily resolve the
situation in this way:
- multiple boots failed with hoary 2.6.10-5-amd64-k8-smp in the way described below
  - normal boot (just hit <Enter>): failed
  - boot with "init=/bin/bash" appended: failed
  - boot with "single" appended: failed
- booted from a "live" CD, where "watch cat /proc/mdstat" showed /dev/md1
  re-syncing
- after the re-sync was complete, I was able to reboot with kernel 2.6.12.2-bef
  without incident (smp kernel)
- then tried booting 2.6.10-5-amd64-k8-smp again, which also succeeded
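
The manual wait in the steps above (watching /proc/mdstat from the live CD
until the re-sync finishes) could be scripted roughly as follows.  This is only
a sketch: the function names md_sync_in_progress and wait_for_md_sync are made
up for illustration, and it assumes the kernel reports a running or pending
rebuild with a "resync =" or "recovery =" style line in /proc/mdstat.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of mdadm or the initrd.

# Succeeds (exit 0) while any md array reports a resync or recovery.
# The optional argument lets the check run against a saved copy of mdstat;
# it defaults to the live /proc/mdstat.
md_sync_in_progress() {
    mdstat_file="${1:-/proc/mdstat}"
    # Assumption: a rebuild shows up as "resync = N%" / "recovery = N%",
    # or "resync=DELAYED" / "resync=PENDING" when queued.
    grep -Eq '(resync|recovery)[[:space:]]*=' "$mdstat_file"
}

# Polls until no array reports a rebuild, sleeping between checks.
# $1: mdstat file (default /proc/mdstat), $2: poll interval in seconds.
wait_for_md_sync() {
    mdstat_file="${1:-/proc/mdstat}"
    interval="${2:-10}"
    while md_sync_in_progress "$mdstat_file"; do
        sleep "$interval"
    done
}
```

Run from the live CD, wait_for_md_sync with no arguments would return once
/dev/md1 has finished re-syncing, at which point a reboot can be attempted.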

Another point of potential interest is the file "/script" on the initrd, as this
is where the RAID arrays are assembled.  It contains the following for my system:

   mdadm -A /devfs/md/1 -R -u 55f3a23c:a0ef4950:a0a11bfe:250e3f63 /dev/sda2 /dev/sdb2
   mkdir /devfs/vg0
   mount_tmpfs /var
   if [ -f /etc/lvm/lvm.conf ]; then
   cat /etc/lvm/lvm.conf > /var/lvm.conf
   fi
   mount_tmpfs /etc/lvm
   if [ -f /var/lvm.conf ]; then
   cat /var/lvm.conf > /etc/lvm/lvm.conf
   fi
   mount -nt devfs devfs /dev
   vgchange -a y vg0
   umount /dev
   umount -n /var
   umount -n /etc/lvm
   ROOT=/dev/mapper/vg0-root
   mdadm -A /devfs/md/1 -R -u 55f3a23c:a0ef4950:a0a11bfe:250e3f63 /dev/sda2 /dev/sdb2

The machine is in use now, and I am unable to perform further tests on it.
However, I will be receiving another one soon (identical, I believe), and will
be able to re-create this problem and do further testing on it.  Please let me
know if there are tests you would like me to perform.
