[Bug 925280] Re: Software RAID fails to rebuild after testing degraded cold boot
Jeff Lane
jeffrey.lane at canonical.com
Fri Feb 10 01:39:32 UTC 2012
OK... so I retried... Here is mdstat after installing and booting, and
waiting to ensure all syncing had completed:
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sda2[0] sdb2[1]
19529656 blocks super 1.2 [2/2] [UU]
md0 : active raid1 sda1[0] sdb1[1]
48826296 blocks super 1.2 [2/2] [UU]
md2 : active raid1 sda3[0] sdb3[1]
175779768 blocks super 1.2 [2/2] [UU]
unused devices: <none>
Next, I shut down the system and remove the second disk (sdb). On
reboot, I check /proc/mdstat and note the degraded arrays with missing
members:
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active (auto-read-only) raid1 sda2[0]
19529656 blocks super 1.2 [2/1] [U_]
md0 : active raid1 sda1[0]
48826296 blocks super 1.2 [2/1] [U_]
md2 : active raid1 sda3[0]
175779768 blocks super 1.2 [2/1] [U_]
unused devices: <none>
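For anyone repeating this check, here is a minimal sketch of picking the
degraded arrays out of mdstat output. The function name degraded_arrays
is my own and not part of mdadm. It assumes the status string looks like
[U_] as in the output above, and it reads /proc/mdstat unless given
another file.

```shell
# List md arrays whose member-status string (e.g. [U_]) shows a missing member.
# Reads mdstat-formatted text from the file given as $1 (default: /proc/mdstat).
degraded_arrays() {
    awk '
        # Remember the array name from lines such as "md0 : active raid1 ...".
        /^md[0-9]+ *:/ { name = $1; sub(/:$/, "", name); next }
        # A status string with at least one "_" means a member is missing.
        /\[[U_]*_[U_]*\]/ { if (name != "") print name }
    ' "${1:-/proc/mdstat}"
}
```

Running degraded_arrays with no argument on the booted system should
print one array name per line, and nothing at all when every array is
healthy.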
Then I shut down, re-insert drive 2 and boot again:
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sda2[0] sdb2[1]
19529656 blocks super 1.2 [2/2] [UU]
md0 : active raid1 sda1[0]
48826296 blocks super 1.2 [2/1] [U_]
md2 : active raid1 sda3[0]
175779768 blocks super 1.2 [2/1] [U_]
unused devices: <none>
Then I try manually adding the disks per the test case:
bladernr at ubuntu:~$ sudo mdadm --add /dev/md0 /dev/sdb1
mdadm: /dev/sdb1 reports being an active member for /dev/md0, but a --re-add fails.
mdadm: not performing --add as that would convert /dev/sdb1 in to a spare.
mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sdb1" first.
bladernr at ubuntu:~$ sudo mdadm --re-add /dev/md0 /dev/sdb1
mdadm: --re-add for /dev/sdb1 to /dev/md0 is not possible.
I got the same result for /dev/md2 when trying to re-add /dev/sdb3, so I
zero the superblocks, which essentially blanks the RAID metadata so that
each partition is added to the array as though it were a brand new disk.
bladernr at ubuntu:~$ sudo mdadm --zero-superblock /dev/sdb3
bladernr at ubuntu:~$ sudo mdadm --zero-superblock /dev/sdb1
bladernr at ubuntu:~$ sudo mdadm --add /dev/md0 /dev/sdb1
mdadm: added /dev/sdb1
bladernr at ubuntu:~$ sudo mdadm --add /dev/md2 /dev/sdb3
mdadm: added /dev/sdb3
bladernr at ubuntu:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sda2[0] sdb2[1]
19529656 blocks super 1.2 [2/2] [UU]
md0 : active raid1 sdb1[2] sda1[0]
48826296 blocks super 1.2 [2/1] [U_]
[==>..................] recovery = 11.9% (5819456/48826296) finish=11.9min speed=59970K/sec
md2 : active raid1 sdb3[2] sda3[0]
175779768 blocks super 1.2 [2/1] [U_]
resync=DELAYED
unused devices: <none>
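To watch a rebuild like the one above without re-reading the whole file
by eye, the progress figure can be extracted with a few lines of awk.
This is only a sketch: recovery_progress is a hypothetical helper name,
and it assumes the "recovery = NN.N%" (or "resync = NN.N%") wording
shown in this output.

```shell
# Print "<array> <percent>" for each array that is currently rebuilding.
# Reads mdstat-formatted text from the file given as $1 (default: /proc/mdstat).
recovery_progress() {
    awk '
        # Remember the array name from lines such as "md0 : active raid1 ...".
        /^md[0-9]+ *:/ { name = $1; sub(/:$/, "", name); next }
        # Progress lines carry a percentage; "resync=DELAYED" has none and is skipped.
        /(recovery|resync) *= *[0-9.]+%/ {
            pct = $0
            sub(/.*(recovery|resync) *= */, "", pct)   # drop everything before the number
            sub(/%.*/, "", pct)                        # drop the percent sign and the rest
            print name, pct
        }
    ' "${1:-/proc/mdstat}"
}
```

Arrays that are queued behind another rebuild, like md2 here, are not
listed until their own recovery actually starts.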
According to the test case, plugging the disk back in and rebooting the
server should cause mdadm to automatically re-add the disk and start
re-syncing. At most, I should have to use the --add (or --re-add)
command to add the disk back in manually.
What I am actually having to do is essentially destroy the partitions
for the ext4 LUNs and add them back in as brand new disks. Again, this
did not happen to the swap md device, which DID boot degraded and then
reconnected automatically when I put the disk back in.
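The manual workaround I ended up with can be sketched as a small shell
function. This is not what the test case asks for, only a record of the
steps above. It assumes mdadm exits non-zero when --re-add is refused,
and MDADM is a hypothetical override variable I added so the logic can
be exercised without touching real disks. It needs to run as root.

```shell
# Sketch of the manual workaround: try --re-add first, and only if that is
# refused wipe the md superblock and add the partition back as a new member.
# MDADM is a hypothetical override hook (an assumption, not an mdadm feature).
MDADM=${MDADM:-mdadm}

readd_member() {
    array=$1      # e.g. /dev/md0
    member=$2     # e.g. /dev/sdb1

    if "$MDADM" --re-add "$array" "$member"; then
        echo "re-added $member to $array"
        return 0
    fi

    # Destructive: discards the RAID metadata on the member and forces a full resync.
    "$MDADM" --zero-superblock "$member" || return 1
    "$MDADM" --add "$array" "$member"
}
```

For example, readd_member /dev/md0 /dev/sdb1 followed by
readd_member /dev/md2 /dev/sdb3 repeats what I did by hand.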
** Changed in: mdadm (Ubuntu)
Status: Incomplete => Confirmed
** Changed in: linux (Ubuntu)
Status: Incomplete => New
** Changed in: mdadm (Ubuntu)
Status: Confirmed => New
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to mdadm in Ubuntu.
https://bugs.launchpad.net/bugs/925280
Title:
Software RAID fails to rebuild after testing degraded cold boot
Status in “linux” package in Ubuntu:
New
Status in “mdadm” package in Ubuntu:
New
Bug description:
Attempting the RAID install test with Precise server AMD64.
Hardware config is a 1U server with 2 SATA drives with the following
partitions:
sda: 500GB SATA
sda1: 50GB RAID
sda2: 20GB RAID
sda3: 180GB RAID
sdb: 250GB SATA
sdb1: 50GB RAID
sdb2: 20GB RAID
sdb3: 180GB RAID
Using the instructions found here:
http://testcases.qa.ubuntu.com/Install/ServerRAID1
I created the three partitions for each physical disk. I then created
three RAID devices, md0 - md2, as follows:
md0: 50GB RAID1 using sda1 and sdb1 for /
md1: 20GB RAID1 using sda2 and sdb2 for swap
md2: 180GB RAID1 using sda3 and sdb3 for /home
I then completed the install and rebooted. On the initial boot, I
verified that all three RAID devices were present and active:
bladernr at ubuntu:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sda1[0] sdb1[1]
48826296 blocks super 1.2 [2/2] [UU]
md2 : active raid1 sda3[0] sdb3[1]
175838136 blocks super 1.2 [2/2] [UU]
md1 : active raid1 sda2[0] sdb2[1]
19529656 blocks super 1.2 [2/2] [UU]
I then powered the machine down per the test case instructions,
removed disk 2 (sdb) and powered back up. On reboot, I verified that
the array was active and degraded and powered the system back down,
again per the test instructions.
I re-inserted drive 2 (sdb) and powered the system up again. After
logging in, I rechecked /proc/mdstat, expecting to see both drives for
each md device and a resync in progress. Instead, I found that the
second drive was missing from md0 and md2 while md1 (the swap LUN) was
fine.
bladernr at ubuntu:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sda1[0]
48826296 blocks super 1.2 [2/1] [U_]
md2 : active raid1 sda3[0]
175838136 blocks super 1.2 [2/1] [U_]
md1 : active raid1 sda2[0] sdb2[1]
19529656 blocks super 1.2 [2/2] [UU]
The instructions indicated that I may have to re-add the missing drives
manually, so I attempted this:
bladernr at ubuntu:~$ sudo mdadm --add /dev/md0 /dev/sdb1
mdadm: /dev/sdb1 reports being an active member for /dev/md0, but a --re-add fails.
mdadm: not performing --add as that would convert /dev/sdb1 in to a spare.
mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sdb1" first.
I also tried using --re-add:
bladernr at ubuntu:~$ sudo mdadm --re-add /dev/md0 /dev/sdb1
mdadm: --re-add for /dev/sdb1 to /dev/md0 is not possible
So here's some info from mdadm:
/dev/md0:
Version : 1.2
Creation Time : Wed Feb 1 20:53:34
Raid Level : raid1
Array Size : 48826296 (46.56 GiB 50.00GB)
Used Dev Size : 48826296 (46.56 GiB 50.00GB)
Raid Devices : 2
Total Devices : 1
Persistence : Superblock is persistent
Update Time : Wed Feb 1 23:54:04 2012
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Name : ubuntu:0 (local to host ubuntu)
UUID : 118d60db:4ddc5cf2:040c4cb2:bd896eaf
Events : 118
Number Major Minor RaidDevices State
0 8 1 0 active sync /dev/sda1
1 0 0 1 removed
So according to the test instructions, this test is a failure because
I can't rebuild the array (nor is it automatically rebuilt).
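The fields in the mdadm output above can also be read from a script,
which makes it easier to compare arrays between boots. This is a sketch
only: detail_field is a hypothetical helper name, and it assumes the
"Key : value" layout shown above.

```shell
# Print the value of one "Key : value" field from `mdadm --detail` output on stdin.
detail_field() {
    awk -F' : ' -v key="$1" '
        # Trim the indentation that mdadm puts in front of each key.
        { k = $1; gsub(/^[ \t]+|[ \t]+$/, "", k) }
        # Print the first matching value and stop.
        k == key { print $2; exit }
    '
}
```

For example, sudo mdadm --detail /dev/md0 | detail_field State would
report the state line, and detail_field Events the event counter.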
ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: linux-image-3.2.0-12-generic 3.2.0-12.21
ProcVersionSignature: Ubuntu 3.2.0-12.21-generic 3.2.2
Uname: Linux 3.2.0-12-generic x86_64
AlsaDevices:
total 0
crw-rw---T 1 root audio 116, 1 Feb 1 23:35 seq
crw-rw---T 1 root audio 116, 33 Feb 1 23:35 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 1.91-0ubuntu1
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
Date: Wed Feb 1 23:38:28 2012
HibernationDevice: RESUME=UUID=e573077c-98b5-42e5-9f37-b8efaa2ba74a
InstallationMedia: Ubuntu-Server 12.04 LTS "Precise Pangolin" - Alpha amd64 (20120201.1)
IwConfig:
lo no wireless extensions.
eth1 no wireless extensions.
eth0 no wireless extensions.
MachineType: Supermicro X7DVL
PciMultimedia:
ProcEnviron:
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-12-generic root=UUID=a84486b9-e72d-4134-82a8-263f91d7d894 ro
RelatedPackageVersions:
linux-restricted-modules-3.2.0-12-generic N/A
linux-backports-modules-3.2.0-12-generic N/A
linux-firmware 1.68
RfKill: Error: [Errno 2] No such file or directory
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 06/23/2008
dmi.bios.vendor: Phoenix Technologies LTD
dmi.bios.version: 2.1
dmi.board.name: X7DVL
dmi.board.vendor: Supermicro
dmi.board.version: PCB Version
dmi.chassis.type: 1
dmi.chassis.vendor: Supermicro
dmi.chassis.version: 0123456789
dmi.modalias: dmi:bvnPhoenixTechnologiesLTD:bvr2.1:bd06/23/2008:svnSupermicro:pnX7DVL:pvr0123456789:rvnSupermicro:rnX7DVL:rvrPCBVersion:cvnSupermicro:ct1:cvr0123456789:
dmi.product.name: X7DVL
dmi.product.version: 0123456789
dmi.sys.vendor: Supermicro