[Bug 945786] Re: mdadm refuses to re-add failed member
iMac
945786 at bugs.launchpad.net
Sat Mar 3 19:04:25 UTC 2012
** Description changed:
- I have my /home in a RAID1 configuration (/dev/md1) with a partition on
- my laptop and a second on an external disk connected via eSATA; A third
- sits on a third external disk. I booted up with one member degraded
- (external drive not plugged in) and prior to login, proceeded to use a
- console to umount, remove and fail the active drive (internal partition
- member) and stop the RAID1 disk, and then plug in my external, re-
- starting the /dev/md1 device with the external partition member active
- and remounting /home. The process is one I have executed many times
- before and is scripted from a couple of files in /usr/local/bin.
+ I have my /home in a three-disk RAID1 configuration (/dev/md1) with a
+ partition on my laptop and a second on an external disk connected via
+ eSATA; A third sits on a third external disk. I booted up with two
+ members degraded (external drive not plugged in) and prior to login,
+ proceeded to use a console to umount, remove and fail the active drive
+ (internal partition member) and stop the RAID1 disk, and then plug in my
+ external, re-starting the /dev/md1 device with the external partition
+ member active and remounting /home. The process is one I have executed
+ many times before and is scripted from a couple of files in
+ /usr/local/bin.
However, this time after logging in with my external member active after
executing the process above, and attempting to re-add the internal drive
to bring the /dev/md1 device in sync with the external disk I received
an error suggesting the add failed. I re-executed the remove, fail, re-
add manually with the same outcome as shown on my console below, and
filed this bug.
It seems the failed disk thinks it is still active, when I use -Q
--examine to interrogate it.
:~# mdadm /dev/md1 -r /dev/sda6
mdadm: hot remove failed for /dev/sda6: No such device or address
:~# mdadm /dev/md1 -f /dev/sda6
mdadm: set device faulty failed for /dev/sda6: No such device
:~# mdadm /dev/md1 -a /dev/sda6
mdadm: /dev/sda6 reports being an active member for /dev/md1, but a --re-add fails.
mdadm: not performing --add as that would convert /dev/sda6 in to a spare.
mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sda6" first.
- :~# mdstat
- Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
+ :~# mdstat
+ Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sdc3[0]
- 86003840 blocks [3/1] [U__]
-
+ 86003840 blocks [3/1] [U__]
+
unused devices: <none>
:~# apport-bug mdadm
- Here is a quick summary of what I did,
+ Here is a quick summary of what I did,
a) My disks were synced on an 11.10 system
b) I upgraded from 11.10 to 12.04 with one member failed (external)
c) After upgrade I failed the active disk (internal), stopped the array, and restarted it with the external disk
d) Attempted to re-add the failed internal disk after logging in
:~# blkid | grep raid_member
- /dev/sda6: UUID="eeeb6708-d108-0847-57e9-714c01b7dbc8" TYPE="linux_raid_member"
- /dev/sdc3: UUID="eeeb6708-d108-0847-57e9-714c01b7dbc8" TYPE="linux_raid_member"
+ /dev/sda6: UUID="eeeb6708-d108-0847-57e9-714c01b7dbc8" TYPE="linux_raid_member"
+ /dev/sdc3: UUID="eeeb6708-d108-0847-57e9-714c01b7dbc8" TYPE="linux_raid_member"
:~# mdadm -D /dev/md1
/dev/md1:
- Version : 0.90
- Creation Time : Sun Jul 27 22:53:23 2008
- Raid Level : raid1
- Array Size : 86003840 (82.02 GiB 88.07 GB)
- Used Dev Size : 86003840 (82.02 GiB 88.07 GB)
- Raid Devices : 3
- Total Devices : 1
+ Version : 0.90
+ Creation Time : Sun Jul 27 22:53:23 2008
+ Raid Level : raid1
+ Array Size : 86003840 (82.02 GiB 88.07 GB)
+ Used Dev Size : 86003840 (82.02 GiB 88.07 GB)
+ Raid Devices : 3
+ Total Devices : 1
Preferred Minor : 1
- Persistence : Superblock is persistent
+ Persistence : Superblock is persistent
- Update Time : Sat Mar 3 13:56:05 2012
- State : clean, degraded
- Active Devices : 1
+ Update Time : Sat Mar 3 13:56:05 2012
+ State : clean, degraded
+ Active Devices : 1
Working Devices : 1
- Failed Devices : 0
- Spare Devices : 0
+ Failed Devices : 0
+ Spare Devices : 0
- UUID : eeeb6708:d1080847:57e9714c:01b7dbc8
- Events : 0.10186827
+ UUID : eeeb6708:d1080847:57e9714c:01b7dbc8
+ Events : 0.10186827
- Number Major Minor RaidDevice State
- 0 8 35 0 active sync /dev/sdc3
- 1 0 0 1 removed
- 2 0 0 2 removed
+ Number Major Minor RaidDevice State
+ 0 8 35 0 active sync /dev/sdc3
+ 1 0 0 1 removed
+ 2 0 0 2 removed
:~# mdadm -Q /dev/sdc3
/dev/sdc3: is not an md array
/dev/sdc3: device 0 in 3 device active raid1 /dev/md1. Use mdadm --examine for more detail.
:~# mdadm -Q /dev/sda6
/dev/sda6: is not an md array
/dev/sda6: device 1 in 3 device mismatch raid1 /dev/md1. Use mdadm --examine for more detail.
:~# mdadm -Q /dev/sda6 --examine
/dev/sda6:
- Magic : a92b4efc
- Version : 0.90.00
- UUID : eeeb6708:d1080847:57e9714c:01b7dbc8
- Creation Time : Sun Jul 27 22:53:23 2008
- Raid Level : raid1
- Used Dev Size : 86003840 (82.02 GiB 88.07 GB)
- Array Size : 86003840 (82.02 GiB 88.07 GB)
- Raid Devices : 3
- Total Devices : 1
+ Magic : a92b4efc
+ Version : 0.90.00
+ UUID : eeeb6708:d1080847:57e9714c:01b7dbc8
+ Creation Time : Sun Jul 27 22:53:23 2008
+ Raid Level : raid1
+ Used Dev Size : 86003840 (82.02 GiB 88.07 GB)
+ Array Size : 86003840 (82.02 GiB 88.07 GB)
+ Raid Devices : 3
+ Total Devices : 1
Preferred Minor : 1
- Update Time : Sat Mar 3 13:28:57 2012
- State : clean
- Active Devices : 1
+ Update Time : Sat Mar 3 13:28:57 2012
+ State : clean
+ Active Devices : 1
Working Devices : 1
- Failed Devices : 1
- Spare Devices : 0
- Checksum : 60f50ddb - correct
- Events : 10128612
+ Failed Devices : 1
+ Spare Devices : 0
+ Checksum : 60f50ddb - correct
+ Events : 10128612
-
- Number Major Minor RaidDevice State
+ Number Major Minor RaidDevice State
this 1 8 6 1 active sync /dev/sda6
- 0 0 0 0 0 removed
- 1 1 8 6 1 active sync /dev/sda6
- 2 2 0 0 2 faulty removed
+ 0 0 0 0 0 removed
+ 1 1 8 6 1 active sync /dev/sda6
+ 2 2 0 0 2 faulty removed
clearly it is not active (0,8,35,0 is per -D output above), but it
thinks it is.
Captured enough.. time to reboot and see what happens; Hopefully an
auto-rebuild. I have the third disk in the array separate should some
corruption happen here.
ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: mdadm 3.2.3-2ubuntu1
ProcVersionSignature: Ubuntu 3.2.0-17.27-generic 3.2.6
Uname: Linux 3.2.0-17-generic x86_64
NonfreeKernelModules: fglrx
ApportVersion: 1.94-0ubuntu1
Architecture: amd64
Date: Sat Mar 3 13:33:11 2012
MDadmExamine.dev.sda:
- /dev/sda:
- MBR Magic : aa55
- Partition[0] : 121660182 sectors at 63 (type 07)
- Partition[1] : 503477100 sectors at 121660245 (type 05)
+ /dev/sda:
+ MBR Magic : aa55
+ Partition[0] : 121660182 sectors at 63 (type 07)
+ Partition[1] : 503477100 sectors at 121660245 (type 05)
MDadmExamine.dev.sda2:
- /dev/sda2:
- MBR Magic : aa55
- Partition[0] : 78124032 sectors at 63 (type 83)
- Partition[1] : 172007893 sectors at 78124095 (type 05)
+ /dev/sda2:
+ MBR Magic : aa55
+ Partition[0] : 78124032 sectors at 63 (type 83)
+ Partition[1] : 172007893 sectors at 78124095 (type 05)
MDadmExamine.dev.sda5: Error: command ['/sbin/mdadm', '-E', '/dev/sda5'] failed with exit code 1: mdadm: No md superblock detected on /dev/sda5.
MDadmExamine.dev.sda7: Error: command ['/sbin/mdadm', '-E', '/dev/sda7'] failed with exit code 1: mdadm: No md superblock detected on /dev/sda7.
MDadmExamine.dev.sdb: Error: command ['/sbin/mdadm', '-E', '/dev/sdb'] failed with exit code 1: mdadm: cannot open /dev/sdb: No medium found
MDadmExamine.dev.sdc:
- /dev/sdc:
- MBR Magic : aa55
- Partition[0] : 104438502 sectors at 63 (type 83)
- Partition[1] : 20498940 sectors at 104438565 (type 0b)
- Partition[2] : 172007893 sectors at 124937505 (type fd)
+ /dev/sdc:
+ MBR Magic : aa55
+ Partition[0] : 104438502 sectors at 63 (type 83)
+ Partition[1] : 20498940 sectors at 104438565 (type 0b)
+ Partition[2] : 172007893 sectors at 124937505 (type fd)
MDadmExamine.dev.sdc1: Error: command ['/sbin/mdadm', '-E', '/dev/sdc1'] failed with exit code 1: mdadm: No md superblock detected on /dev/sdc1.
MDadmExamine.dev.sdc2:
- /dev/sdc2:
- MBR Magic : aa55
+ /dev/sdc2:
+ MBR Magic : aa55
MachineType: Hewlett-Packard HP Pavilion dv5 Notebook PC
ProcEnviron:
- LANGUAGE=en
- TERM=xterm
- LANG=en_US.utf8
- SHELL=/bin/bash
+ LANGUAGE=en
+ TERM=xterm
+ LANG=en_US.utf8
+ SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-17-generic root=UUID=10f8a2ac-5ab7-43a2-bdf8-92eee349e09d ro quiet splash vt.handoff=7
ProcMDstat:
- Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
- md1 : active raid1 sdc3[0]
- 86003840 blocks [3/1] [U__]
-
- unused devices: <none>
+ Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
+ md1 : active raid1 sdc3[0]
+ 86003840 blocks [3/1] [U__]
+
+ unused devices: <none>
SourcePackage: mdadm
UpgradeStatus: Upgraded to precise on 2012-03-03 (0 days ago)
dmi.bios.date: 08/19/2009
dmi.bios.vendor: Hewlett-Packard
dmi.bios.version: F.37
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: 30F2
dmi.board.vendor: Quanta
dmi.board.version: 98.36
dmi.chassis.type: 10
dmi.chassis.vendor: Quanta
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnHewlett-Packard:bvrF.37:bd08/19/2009:svnHewlett-Packard:pnHPPaviliondv5NotebookPC:pvrRev1:rvnQuanta:rn30F2:rvr98.36:cvnQuanta:ct10:cvrN/A:
dmi.product.name: HP Pavilion dv5 Notebook PC
dmi.product.version: Rev 1
dmi.sys.vendor: Hewlett-Packard
mtime.conffile..etc.udev.rules.d.85.mdadm.rules: 2009-01-02T11:08:01
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to mdadm in Ubuntu.
https://bugs.launchpad.net/bugs/945786
Title:
mdadm refuses to re-add failed member
Status in “mdadm” package in Ubuntu:
New
Bug description:
I have my /home on a three-disk RAID1 array (/dev/md1): one member is a
partition on my laptop, a second is on an external disk connected via
eSATA, and a third sits on another external disk. I booted up with two
members missing (external drive not plugged in). Prior to login, I used
a console to umount /home, remove and fail the active member (the
internal partition) and stop the array. I then plugged in my external
disk, restarted the /dev/md1 device with the external partition member
active, and remounted /home. I have executed this process many times
before; it is scripted in a couple of files in /usr/local/bin.
However, this time, after logging in with my external member active, I
attempted to re-add the internal partition to bring the /dev/md1 device
back in sync with the external disk, and received an error indicating
that the add had failed. I repeated the remove, fail and re-add steps
manually with the same outcome, as shown in the console output below,
and filed this bug.
When I interrogate the failed disk with -Q --examine, it seems to think
it is still active.
:~# mdadm /dev/md1 -r /dev/sda6
mdadm: hot remove failed for /dev/sda6: No such device or address
:~# mdadm /dev/md1 -f /dev/sda6
mdadm: set device faulty failed for /dev/sda6: No such device
:~# mdadm /dev/md1 -a /dev/sda6
mdadm: /dev/sda6 reports being an active member for /dev/md1, but a --re-add fails.
mdadm: not performing --add as that would convert /dev/sda6 in to a spare.
mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sda6" first.
:~# mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sdc3[0]
86003840 blocks [3/1] [U__]
unused devices: <none>
:~# apport-bug mdadm
Here is a quick summary of what I did:
a) My disks were synced on an 11.10 system
b) I upgraded from 11.10 to 12.04 with one member failed (external)
c) After upgrade I failed the active disk (internal), stopped the array, and restarted it with the external disk
d) Attempted to re-add the failed internal disk after logging in
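For reference, steps c) and d) might look roughly like the sketch below. This is not the reporter's actual script from /usr/local/bin, which is not included in this report. The device names are taken from the console output; the `run` helper, the function names and the DRY_RUN variable are made up for illustration, and the mount step assumes /home can be mounted directly from /dev/md1. The sketch only prints the commands unless DRY_RUN=0 is set, and it does not check whether md accepts each step on a given kernel.

```shell
#!/bin/sh
# Sketch of steps c) and d). The commands below are assumptions about what
# the reporter's scripts do; only the device names come from the report.
MD=/dev/md1          # the RAID1 array holding /home
INTERNAL=/dev/sda6   # laptop partition member
EXTERNAL=/dev/sdc3   # eSATA partition member
DRY_RUN=${DRY_RUN:-1}

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Step c): take the internal member out and restart on the external one.
swap_to_external() {
    run umount /home
    run mdadm "$MD" --fail "$INTERNAL"
    run mdadm "$MD" --remove "$INTERNAL"
    run mdadm --stop "$MD"
    run mdadm --assemble --run "$MD" "$EXTERNAL"
    run mount "$MD" /home
}

# Step d): the re-add that fails in this report.
readd_internal() {
    run mdadm "$MD" --re-add "$INTERNAL"
}
```

Calling `swap_to_external` followed by `readd_internal` prints the seven commands in order, which makes it easy to compare them against the real scripts before running anything against a live array.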
:~# blkid | grep raid_member
/dev/sda6: UUID="eeeb6708-d108-0847-57e9-714c01b7dbc8" TYPE="linux_raid_member"
/dev/sdc3: UUID="eeeb6708-d108-0847-57e9-714c01b7dbc8" TYPE="linux_raid_member"
:~# mdadm -D /dev/md1
/dev/md1:
Version : 0.90
Creation Time : Sun Jul 27 22:53:23 2008
Raid Level : raid1
Array Size : 86003840 (82.02 GiB 88.07 GB)
Used Dev Size : 86003840 (82.02 GiB 88.07 GB)
Raid Devices : 3
Total Devices : 1
Preferred Minor : 1
Persistence : Superblock is persistent
Update Time : Sat Mar 3 13:56:05 2012
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
UUID : eeeb6708:d1080847:57e9714c:01b7dbc8
Events : 0.10186827
Number Major Minor RaidDevice State
0 8 35 0 active sync /dev/sdc3
1 0 0 1 removed
2 0 0 2 removed
:~# mdadm -Q /dev/sdc3
/dev/sdc3: is not an md array
/dev/sdc3: device 0 in 3 device active raid1 /dev/md1. Use mdadm --examine for more detail.
:~# mdadm -Q /dev/sda6
/dev/sda6: is not an md array
/dev/sda6: device 1 in 3 device mismatch raid1 /dev/md1. Use mdadm --examine for more detail.
:~# mdadm -Q /dev/sda6 --examine
/dev/sda6:
Magic : a92b4efc
Version : 0.90.00
UUID : eeeb6708:d1080847:57e9714c:01b7dbc8
Creation Time : Sun Jul 27 22:53:23 2008
Raid Level : raid1
Used Dev Size : 86003840 (82.02 GiB 88.07 GB)
Array Size : 86003840 (82.02 GiB 88.07 GB)
Raid Devices : 3
Total Devices : 1
Preferred Minor : 1
Update Time : Sat Mar 3 13:28:57 2012
State : clean
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Checksum : 60f50ddb - correct
Events : 10128612
Number Major Minor RaidDevice State
this 1 8 6 1 active sync /dev/sda6
0 0 0 0 0 removed
1 1 8 6 1 active sync /dev/sda6
2 2 0 0 2 faulty removed
Clearly it is not active (the -D output above shows 0,8,35,0, i.e.
/dev/sdc3, as the only active member), but it thinks it is.
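The mismatch can also be seen by comparing the event counters and update times in the two outputs: the array reports Events 0.10186827 (updated 13:56:05) while the /dev/sda6 superblock reports Events 10128612 (updated 13:28:57). A minimal sketch of that comparison follows. It assumes that the "hi.lo" form printed by -D for 0.90 metadata is the upper and lower 32 bits of the same counter that --examine prints as a single number; the function names are hypothetical.

```shell
#!/bin/sh
# Normalise an md event counter to a single integer.
# ASSUMPTION: "hi.lo" (as printed by `mdadm -D` for 0.90 metadata) is the
# upper and lower 32 bits of the counter that --examine prints whole.
parse_events() {
    case "$1" in
        *.*) echo $(( (${1%%.*} << 32) + ${1#*.} )) ;;
        *)   echo "$1" ;;
    esac
}

# Print how many events a member's superblock lags behind the running array.
events_behind() {
    array=$(parse_events "$1")
    member=$(parse_events "$2")
    echo $(( array - member ))
}

# Values copied from the -D and --examine output in this report.
events_behind 0.10186827 10128612   # prints 58215
```

A positive result means the member's superblock is older than the running array, which is consistent with /dev/sda6 still recording itself as "active sync" from before the array was restarted on /dev/sdc3.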
I have captured enough; time to reboot and see what happens, hopefully
an auto-rebuild. I am keeping the third disk of the array separate in
case some corruption happens here.
ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: mdadm 3.2.3-2ubuntu1
ProcVersionSignature: Ubuntu 3.2.0-17.27-generic 3.2.6
Uname: Linux 3.2.0-17-generic x86_64
NonfreeKernelModules: fglrx
ApportVersion: 1.94-0ubuntu1
Architecture: amd64
Date: Sat Mar 3 13:33:11 2012
MDadmExamine.dev.sda:
/dev/sda:
MBR Magic : aa55
Partition[0] : 121660182 sectors at 63 (type 07)
Partition[1] : 503477100 sectors at 121660245 (type 05)
MDadmExamine.dev.sda2:
/dev/sda2:
MBR Magic : aa55
Partition[0] : 78124032 sectors at 63 (type 83)
Partition[1] : 172007893 sectors at 78124095 (type 05)
MDadmExamine.dev.sda5: Error: command ['/sbin/mdadm', '-E', '/dev/sda5'] failed with exit code 1: mdadm: No md superblock detected on /dev/sda5.
MDadmExamine.dev.sda7: Error: command ['/sbin/mdadm', '-E', '/dev/sda7'] failed with exit code 1: mdadm: No md superblock detected on /dev/sda7.
MDadmExamine.dev.sdb: Error: command ['/sbin/mdadm', '-E', '/dev/sdb'] failed with exit code 1: mdadm: cannot open /dev/sdb: No medium found
MDadmExamine.dev.sdc:
/dev/sdc:
MBR Magic : aa55
Partition[0] : 104438502 sectors at 63 (type 83)
Partition[1] : 20498940 sectors at 104438565 (type 0b)
Partition[2] : 172007893 sectors at 124937505 (type fd)
MDadmExamine.dev.sdc1: Error: command ['/sbin/mdadm', '-E', '/dev/sdc1'] failed with exit code 1: mdadm: No md superblock detected on /dev/sdc1.
MDadmExamine.dev.sdc2:
/dev/sdc2:
MBR Magic : aa55
MachineType: Hewlett-Packard HP Pavilion dv5 Notebook PC
ProcEnviron:
LANGUAGE=en
TERM=xterm
LANG=en_US.utf8
SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-17-generic root=UUID=10f8a2ac-5ab7-43a2-bdf8-92eee349e09d ro quiet splash vt.handoff=7
ProcMDstat:
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sdc3[0]
86003840 blocks [3/1] [U__]
unused devices: <none>
SourcePackage: mdadm
UpgradeStatus: Upgraded to precise on 2012-03-03 (0 days ago)
dmi.bios.date: 08/19/2009
dmi.bios.vendor: Hewlett-Packard
dmi.bios.version: F.37
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: 30F2
dmi.board.vendor: Quanta
dmi.board.version: 98.36
dmi.chassis.type: 10
dmi.chassis.vendor: Quanta
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnHewlett-Packard:bvrF.37:bd08/19/2009:svnHewlett-Packard:pnHPPaviliondv5NotebookPC:pvrRev1:rvnQuanta:rn30F2:rvr98.36:cvnQuanta:ct10:cvrN/A:
dmi.product.name: HP Pavilion dv5 Notebook PC
dmi.product.version: Rev 1
dmi.sys.vendor: Hewlett-Packard
mtime.conffile..etc.udev.rules.d.85.mdadm.rules: 2009-01-02T11:08:01
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/945786/+subscriptions