[Bug 1014848] Re: Systems sometimes come up with RAID devices dropped, re-adding fails.
Dmitrijs Ledkovs
1014848 at bugs.launchpad.net
Mon Jun 18 22:32:36 UTC 2012
(launchpad email interface is failing me)
Thank you for filing a bug report. I was not subscribed to the Launchpad
answers; I will subscribe now.
First things first. You are outlining a couple of issues here, and I am
going to mark this bug as invalid on the basis that it is a support-type
question and not a real bug report. In any case, if some of these are
real bugs, they will need to be split into separate bug reports.
Please take some time to read https://wiki.ubuntu.com/ReliableRaid
I am working on improving the situation.
Here we go.
On 18/06/12 22:28, Iordan Iordanov wrote:
> Public bug reported:
>
> We have 3 systems with RAID1 sets which sometimes come up with one
> device missing. Attempting to re-add the device fails with:
>
This is probably due to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/942106
Does adding settle help?
> # mdadm /dev/md2 --re-add /dev/sda6
> mdadm: --re-add for /dev/sda6 to /dev/md2 is not possible
>
> # mdadm /dev/md2 --add /dev/sda6
> mdadm: /dev/sda6 reports being an active member for /dev/md2, but a --re-add fails.
> mdadm: not performing --add as that would convert /dev/sda6 in to a spare.
> mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sda6" first.
>
This seems odd. Did md2 get brought up degraded? If so, I would think
this should be possible. I will add this to my testing suite, which is
yet to be written.
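For reference, one way to check whether an array came up degraded is to
look at the member map in /proc/mdstat, where a "_" marks a missing
device (e.g. [U_] instead of [UU]). A minimal sketch, assuming the usual
mdstat layout; degraded_arrays is a hypothetical helper name, not part
of mdadm:

```shell
# Print the name of every md array whose member map shows a missing device.
# Reads /proc/mdstat-formatted text on stdin.
# NOTE: degraded_arrays is a hypothetical helper, not an mdadm command.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }        # remember the array this block describes
        match($0, /\[[U_]+\]/) {           # member map, e.g. [UU] or [U_]
            if (index(substr($0, RSTART, RLENGTH), "_")) print name
        }
    '
}

# Usage: degraded_arrays < /proc/mdstat
```

An empty result means every listed array has all of its members present.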
> The only thing which "resolves" the problem is zeroing the superblock on
> the drive, and adding it to the RAID array. This causes an unnecessary
> rebuild of the drive, as it shouldn't have been dropped from the array
> in the first place.
>
Hmmm... I think I saw a bug or two about having to zero out the
superblock. Can you please search the bugs on Launchpad?
> A similar, perhaps related problem occurred on a RAID6 system with 6
> devices. We failed/removed one device, and moved it to another slot on
> the system (probably irrelevant). Trying to add it back into the RAID6
> array resulted in the same error. This has happened with the newest
> kernel (3.2.0-25) as well as with 3.2.0-24 which was used to create this
> bug-report. It has also happened with different RAID levels, which is
> why I filed the bug against mdadm, rather than the kernel.
>
Are the disks regular SATA attached? Or something special like iSCSI?
I bet the superblock metadata on the device was still old, and there
may have been a dangling link somewhere.
I will dig through the collected debug info and maybe respond with a
few more details.
> A question with more details was posted by a colleague of mine here:
> https://answers.launchpad.net/ubuntu/+source/mdadm/+question/199471
>
I will look into that.
> A question was posted by me in the linux-raid mailing list here:
> http://marc.info/?t=133953601900006&r=1&w=2
>
I will look, but in general I currently do not have the capacity to
follow the linux-raid mailing list.
Regards,
Dmitrijs.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to mdadm in Ubuntu.
https://bugs.launchpad.net/bugs/1014848
Title:
Systems sometimes come up with RAID devices dropped, re-adding fails.
Status in “mdadm” package in Ubuntu:
Invalid
Bug description:
We have 3 systems with RAID1 sets which sometimes come up with one
device missing. Attempting to re-add the device fails with:
# mdadm /dev/md2 --re-add /dev/sda6
mdadm: --re-add for /dev/sda6 to /dev/md2 is not possible
# mdadm /dev/md2 --add /dev/sda6
mdadm: /dev/sda6 reports being an active member for /dev/md2, but a --re-add fails.
mdadm: not performing --add as that would convert /dev/sda6 in to a spare.
mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sda6" first.
The only thing which "resolves" the problem is zeroing the superblock
on the drive, and adding it to the RAID array. This causes an
unnecessary rebuild of the drive, as it shouldn't have been dropped
from the array in the first place.
A similar, perhaps related problem occurred on a RAID6 system with 6
devices. We failed/removed one device, and moved it to another slot on
the system (probably irrelevant). Trying to add it back into the RAID6
array resulted in the same error. This has happened with the newest
kernel (3.2.0-25) as well as with 3.2.0-24 which was used to create
this bug-report. It has also happened with different RAID levels,
which is why I filed the bug against mdadm, rather than the kernel.
A question with more details was posted by a colleague of mine here:
https://answers.launchpad.net/ubuntu/+source/mdadm/+question/199471
A question was posted by me in the linux-raid mailing list here:
http://marc.info/?t=133953601900006&r=1&w=2
ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: mdadm 3.2.3-2ubuntu1
ProcVersionSignature: Ubuntu 3.2.0-24.39-generic-pae 3.2.16
Uname: Linux 3.2.0-24-generic-pae i686
ApportVersion: 2.0.1-0ubuntu8
Architecture: i386
Date: Mon Jun 18 17:12:16 2012
Lsusb:
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 002: ID 04b4:6560 Cypress Semiconductor Corp. CY7C65640 USB-2.0 "TetraHub"
MDadmExamine.dev.sda: Error: command ['/sbin/mdadm', '-E', '/dev/sda'] failed with exit code 1: mdadm: cannot open /dev/sda: Permission denied
MDadmExamine.dev.sda1: Error: command ['/sbin/mdadm', '-E', '/dev/sda1'] failed with exit code 1: mdadm: cannot open /dev/sda1: Permission denied
MDadmExamine.dev.sda2: Error: command ['/sbin/mdadm', '-E', '/dev/sda2'] failed with exit code 1: mdadm: cannot open /dev/sda2: Permission denied
MDadmExamine.dev.sda5: Error: command ['/sbin/mdadm', '-E', '/dev/sda5'] failed with exit code 1: mdadm: cannot open /dev/sda5: Permission denied
MDadmExamine.dev.sda6: Error: command ['/sbin/mdadm', '-E', '/dev/sda6'] failed with exit code 1: mdadm: cannot open /dev/sda6: Permission denied
MDadmExamine.dev.sda7: Error: command ['/sbin/mdadm', '-E', '/dev/sda7'] failed with exit code 1: mdadm: cannot open /dev/sda7: Permission denied
MDadmExamine.dev.sda8: Error: command ['/sbin/mdadm', '-E', '/dev/sda8'] failed with exit code 1: mdadm: cannot open /dev/sda8: Permission denied
MDadmExamine.dev.sda9: Error: command ['/sbin/mdadm', '-E', '/dev/sda9'] failed with exit code 1: mdadm: cannot open /dev/sda9: Permission denied
MDadmExamine.dev.sdb: Error: command ['/sbin/mdadm', '-E', '/dev/sdb'] failed with exit code 1: mdadm: cannot open /dev/sdb: Permission denied
MDadmExamine.dev.sdb1: Error: command ['/sbin/mdadm', '-E', '/dev/sdb1'] failed with exit code 1: mdadm: cannot open /dev/sdb1: Permission denied
MachineType: Dell Computer Corporation PowerEdge 860
ProcEnviron:
TERM=xterm
PATH=(custom, no user)
SHELL=/bin/bash
ProcKernelCmdLine: root=/dev/sda1 ro
ProcMDstat:
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sda9[2] sdb1[0]
409598840 blocks super 1.2 [2/2] [UU]
unused devices: <none>
SourcePackage: mdadm
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/04/2007
dmi.bios.vendor: Dell Computer Corporation
dmi.bios.version: A03
dmi.board.name: 0XM089
dmi.board.vendor: Dell Computer Corporation
dmi.board.version: A00
dmi.chassis.type: 23
dmi.chassis.vendor: Dell Computer Corporation
dmi.modalias: dmi:bvnDellComputerCorporation:bvrA03:bd04/04/2007:svnDellComputerCorporation:pnPowerEdge860:pvr:rvnDellComputerCorporation:rn0XM089:rvrA00:cvnDellComputerCorporation:ct23:cvr:
dmi.product.name: PowerEdge 860
dmi.sys.vendor: Dell Computer Corporation
etc.blkid.tab:
<device DEVNO="0x0801" TIME="1339993953.873769" UUID="153fb914-13c1-4ab2-9368-be196bbe589a" TYPE="ext4">/dev/sda1</device>
<device DEVNO="0x0806" TIME="1339993967.110129" UUID="af709399-931c-4652-9565-5f53ccf8b5ce" TYPE="ext4">/dev/sda6</device>
<device DEVNO="0x0900" TIME="1339994011.818058" UUID="e00278de-892f-4477-bf35-385af3725b28" TYPE="ext4">/dev/md0</device>
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/1014848/+subscriptions