[Bug 945786] Re: mdadm refuses to re-add failed member
iMac
945786 at bugs.launchpad.net
Sat Mar 3 19:14:06 UTC 2012
Still odd after a reboot. /proc/mdstat looks fine, but the superblock on
the failed member (/dev/sda6) still reports it as active.
:~# cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sdb3[0]
86003840 blocks [3/1] [U__]
unused devices: <none>
:~# mdadm -D /dev/md1
/dev/md1:
Version : 0.90
Creation Time : Sun Jul 27 22:53:23 2008
Raid Level : raid1
Array Size : 86003840 (82.02 GiB 88.07 GB)
Used Dev Size : 86003840 (82.02 GiB 88.07 GB)
Raid Devices : 3
Total Devices : 1
Preferred Minor : 1
Persistence : Superblock is persistent
Update Time : Sat Mar 3 14:12:24 2012
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
UUID : eeeb6708:d1080847:57e9714c:01b7dbc8
Events : 0.10187219
    Number   Major   Minor   RaidDevice State
       0       8       19        0      active sync   /dev/sdb3
       1       0        0        1      removed
       2       0        0        2      removed
:~# mdadm -Q --examine /dev/sda6
/dev/sda6:
Magic : a92b4efc
Version : 0.90.00
UUID : eeeb6708:d1080847:57e9714c:01b7dbc8
Creation Time : Sun Jul 27 22:53:23 2008
Raid Level : raid1
Used Dev Size : 86003840 (82.02 GiB 88.07 GB)
Array Size : 86003840 (82.02 GiB 88.07 GB)
Raid Devices : 3
Total Devices : 1
Preferred Minor : 1
Update Time : Sat Mar 3 13:28:57 2012
State : clean
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Checksum : 60f50ddb - correct
Events : 10128612
      Number   Major   Minor   RaidDevice State
this     1       8        6        1      active sync   /dev/sda6
   0     0       0        0        0      removed
   1     1       8        6        1      active sync   /dev/sda6
   2     2       0        0        2      faulty removed
:~# mdadm -Q --examine /dev/sdb3
/dev/sdb3:
Magic : a92b4efc
Version : 0.90.00
UUID : eeeb6708:d1080847:57e9714c:01b7dbc8
Creation Time : Sun Jul 27 22:53:23 2008
Raid Level : raid1
Used Dev Size : 86003840 (82.02 GiB 88.07 GB)
Array Size : 86003840 (82.02 GiB 88.07 GB)
Raid Devices : 3
Total Devices : 1
Preferred Minor : 1
Update Time : Sat Mar 3 14:12:34 2012
State : clean
Active Devices : 1
Working Devices : 1
Failed Devices : 2
Spare Devices : 0
Checksum : 60f6e218 - correct
Events : 10187225
      Number   Major   Minor   RaidDevice State
this     0       8       19        0      active sync   /dev/sdb3
   0     0       8       19        0      active sync   /dev/sdb3
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
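When comparing the two superblocks above, the Events counter is the
quickest tell: md bumps it as the array's superblocks are updated, so
the member with the lower count is the stale one. What follows is a
minimal sketch, assuming the --examine output of each member has been
saved to a file. The function names events_of and compare_events are
made up for illustration and are not part of mdadm.

```shell
#!/bin/sh
# Print the "Events" counter from `mdadm --examine` text read on stdin.
# Expects the integer form shown by --examine on 0.90 superblocks,
# e.g. "Events : 10187225" (leading indentation is tolerated).
events_of() {
    awk -F' *: *' '$1 ~ /^[ \t]*Events$/ { print $2; exit }'
}

# compare_events FILE_A FILE_B
# Each file holds the saved --examine output of one member.
# Reports which file carries the lower (staler) event count.
compare_events() {
    a=$(events_of < "$1")
    b=$(events_of < "$2")
    if [ "$a" -eq "$b" ]; then
        echo "in sync at $a"
    elif [ "$a" -lt "$b" ]; then
        echo "$1 is behind by $((b - a)) events"
    else
        echo "$2 is behind by $((a - b)) events"
    fi
}
```

Fed the two --examine dumps above, this would report the /dev/sda6 file
as 58613 events behind (10187225 - 10128612).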
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to mdadm in Ubuntu.
https://bugs.launchpad.net/bugs/945786
Title:
mdadm refuses to re-add failed member
Status in “mdadm” package in Ubuntu:
New
Bug description:
I have my /home in a three-disk RAID1 configuration (/dev/md1): one
member is a partition on my laptop, a second is on an external disk
connected via eSATA, and a third sits on another external disk. I
booted up with two members missing (external drive not plugged in).
Prior to login, I used a console to umount /home, remove and fail the
active drive (the internal partition member) and stop the array. I
then plugged in my external disk, restarted the /dev/md1 device with
the external partition member active, and remounted /home. I have
executed this process many times before; it is scripted in a couple of
files in /usr/local/bin.
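The swap described above can be sketched roughly as follows. This is a
hypothetical reconstruction, not the scripts from /usr/local/bin, which
are not included in this report. The names run and swap_member and the
DRY_RUN switch are assumptions for illustration; the mdadm options used
(--fail, --remove, --stop, --assemble --run) are standard ones. The
sketch marks the member faulty before removing it, since md generally
refuses to hot-remove a member that is still active, and it assumes the
mount point has an /etc/fstab entry. It does not check whether md
accepts failing the only working member.

```shell
#!/bin/sh
# Hypothetical reconstruction of the member swap; not the reporter's script.
# With DRY_RUN=1 (the default here) commands are only printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# swap_member ARRAY OLD_MEMBER NEW_MEMBER MOUNTPOINT
# Stops ARRAY after dropping OLD_MEMBER, then restarts it degraded on
# NEW_MEMBER and remounts MOUNTPOINT. Stops at the first failing step.
swap_member() {
    array=$1 old=$2 new=$3 mnt=$4
    run umount "$mnt" &&
    run mdadm "$array" --fail "$old" &&
    run mdadm "$array" --remove "$old" &&
    run mdadm --stop "$array" &&
    run mdadm --assemble --run "$array" "$new" &&
    run mount "$mnt"
}
```

For the devices in this report the call would be
swap_member /dev/md1 /dev/sda6 /dev/sdc3 /home, which in dry-run mode
only lists the six commands for review.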
However, this time, after logging in with my external member active and
attempting to re-add the internal drive to bring the /dev/md1 device
back in sync with the external disk, I received an error saying the add
had failed. I repeated the remove, fail and re-add manually with the
same outcome, as shown on my console below, and filed this bug.
When I interrogate the failed disk with -Q --examine, it still reports
itself as active.
:~# mdadm /dev/md1 -r /dev/sda6
mdadm: hot remove failed for /dev/sda6: No such device or address
:~# mdadm /dev/md1 -f /dev/sda6
mdadm: set device faulty failed for /dev/sda6: No such device
:~# mdadm /dev/md1 -a /dev/sda6
mdadm: /dev/sda6 reports being an active member for /dev/md1, but a --re-add fails.
mdadm: not performing --add as that would convert /dev/sda6 in to a spare.
mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sda6" first.
:~# mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sdc3[0]
86003840 blocks [3/1] [U__]
unused devices: <none>
:~# apport-bug mdadm
Here is a quick summary of what I did:
a) My disks were synced on an 11.10 system
b) I upgraded from 11.10 to 12.04 with one member failed (external)
c) After upgrade I failed the active disk (internal), stopped the array, and restarted it with the external disk
d) Attempted to re-add the failed internal disk after logging in
:~# blkid | grep raid_member
/dev/sda6: UUID="eeeb6708-d108-0847-57e9-714c01b7dbc8" TYPE="linux_raid_member"
/dev/sdc3: UUID="eeeb6708-d108-0847-57e9-714c01b7dbc8" TYPE="linux_raid_member"
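Note that blkid and mdadm print the same array UUID with different
separators (dashes here, colons in the mdadm output), so a plain string
comparison between the two tools will not match. Here is a small sketch
for comparing them, assuming the hex digits appear in the same order in
both, as they do in this report. norm_uuid and same_array are
illustrative names, not existing tools.

```shell
#!/bin/sh
# Strip ':' and '-' separators and lowercase the hex digits, so that
# blkid-style and mdadm-style UUID strings can be compared directly.
norm_uuid() {
    printf '%s\n' "$1" | tr -d ':-' | tr 'A-F' 'a-f'
}

# same_array UUID_A UUID_B
# Exit status 0 if both strings name the same UUID after normalising.
same_array() {
    [ "$(norm_uuid "$1")" = "$(norm_uuid "$2")" ]
}
```

For example, same_array 'eeeb6708-d108-0847-57e9-714c01b7dbc8'
'eeeb6708:d1080847:57e9714c:01b7dbc8' succeeds, confirming that both
partitions listed by blkid carry the UUID of /dev/md1.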
:~# mdadm -D /dev/md1
/dev/md1:
Version : 0.90
Creation Time : Sun Jul 27 22:53:23 2008
Raid Level : raid1
Array Size : 86003840 (82.02 GiB 88.07 GB)
Used Dev Size : 86003840 (82.02 GiB 88.07 GB)
Raid Devices : 3
Total Devices : 1
Preferred Minor : 1
Persistence : Superblock is persistent
Update Time : Sat Mar 3 13:56:05 2012
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
UUID : eeeb6708:d1080847:57e9714c:01b7dbc8
Events : 0.10186827
    Number   Major   Minor   RaidDevice State
       0       8       35        0      active sync   /dev/sdc3
       1       0        0        1      removed
       2       0        0        2      removed
:~# mdadm -Q /dev/sdc3
/dev/sdc3: is not an md array
/dev/sdc3: device 0 in 3 device active raid1 /dev/md1. Use mdadm --examine for more detail.
:~# mdadm -Q /dev/sda6
/dev/sda6: is not an md array
/dev/sda6: device 1 in 3 device mismatch raid1 /dev/md1. Use mdadm --examine for more detail.
:~# mdadm -Q /dev/sda6 --examine
/dev/sda6:
Magic : a92b4efc
Version : 0.90.00
UUID : eeeb6708:d1080847:57e9714c:01b7dbc8
Creation Time : Sun Jul 27 22:53:23 2008
Raid Level : raid1
Used Dev Size : 86003840 (82.02 GiB 88.07 GB)
Array Size : 86003840 (82.02 GiB 88.07 GB)
Raid Devices : 3
Total Devices : 1
Preferred Minor : 1
Update Time : Sat Mar 3 13:28:57 2012
State : clean
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Checksum : 60f50ddb - correct
Events : 10128612
      Number   Major   Minor   RaidDevice State
this     1       8        6        1      active sync   /dev/sda6
   0     0       0        0        0      removed
   1     1       8        6        1      active sync   /dev/sda6
   2     2       0        0        2      faulty removed
Clearly /dev/sda6 is not active (the only active member in the -D
output above is 0, 8, 35, 0, i.e. /dev/sdc3), but it thinks it is.
I have captured enough; time to reboot and see what happens, hopefully
an auto-rebuild. I keep the third disk of the array separate in case
some corruption happens here.
ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: mdadm 3.2.3-2ubuntu1
ProcVersionSignature: Ubuntu 3.2.0-17.27-generic 3.2.6
Uname: Linux 3.2.0-17-generic x86_64
NonfreeKernelModules: fglrx
ApportVersion: 1.94-0ubuntu1
Architecture: amd64
Date: Sat Mar 3 13:33:11 2012
MDadmExamine.dev.sda:
/dev/sda:
MBR Magic : aa55
Partition[0] : 121660182 sectors at 63 (type 07)
Partition[1] : 503477100 sectors at 121660245 (type 05)
MDadmExamine.dev.sda2:
/dev/sda2:
MBR Magic : aa55
Partition[0] : 78124032 sectors at 63 (type 83)
Partition[1] : 172007893 sectors at 78124095 (type 05)
MDadmExamine.dev.sda5: Error: command ['/sbin/mdadm', '-E', '/dev/sda5'] failed with exit code 1: mdadm: No md superblock detected on /dev/sda5.
MDadmExamine.dev.sda7: Error: command ['/sbin/mdadm', '-E', '/dev/sda7'] failed with exit code 1: mdadm: No md superblock detected on /dev/sda7.
MDadmExamine.dev.sdb: Error: command ['/sbin/mdadm', '-E', '/dev/sdb'] failed with exit code 1: mdadm: cannot open /dev/sdb: No medium found
MDadmExamine.dev.sdc:
/dev/sdc:
MBR Magic : aa55
Partition[0] : 104438502 sectors at 63 (type 83)
Partition[1] : 20498940 sectors at 104438565 (type 0b)
Partition[2] : 172007893 sectors at 124937505 (type fd)
MDadmExamine.dev.sdc1: Error: command ['/sbin/mdadm', '-E', '/dev/sdc1'] failed with exit code 1: mdadm: No md superblock detected on /dev/sdc1.
MDadmExamine.dev.sdc2:
/dev/sdc2:
MBR Magic : aa55
MachineType: Hewlett-Packard HP Pavilion dv5 Notebook PC
ProcEnviron:
LANGUAGE=en
TERM=xterm
LANG=en_US.utf8
SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-17-generic root=UUID=10f8a2ac-5ab7-43a2-bdf8-92eee349e09d ro quiet splash vt.handoff=7
ProcMDstat:
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sdc3[0]
86003840 blocks [3/1] [U__]
unused devices: <none>
SourcePackage: mdadm
UpgradeStatus: Upgraded to precise on 2012-03-03 (0 days ago)
dmi.bios.date: 08/19/2009
dmi.bios.vendor: Hewlett-Packard
dmi.bios.version: F.37
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: 30F2
dmi.board.vendor: Quanta
dmi.board.version: 98.36
dmi.chassis.type: 10
dmi.chassis.vendor: Quanta
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnHewlett-Packard:bvrF.37:bd08/19/2009:svnHewlett-Packard:pnHPPaviliondv5NotebookPC:pvrRev1:rvnQuanta:rn30F2:rvr98.36:cvnQuanta:ct10:cvrN/A:
dmi.product.name: HP Pavilion dv5 Notebook PC
dmi.product.version: Rev 1
dmi.sys.vendor: Hewlett-Packard
mtime.conffile..etc.udev.rules.d.85.mdadm.rules: 2009-01-02T11:08:01