[Bug 13046] Kernel lockup when stressing s/w raid on Promise TX2plus SATA cards

bugzilla-daemon at bugzilla.ubuntu.com
Thu Jul 28 13:51:50 UTC 2005


Please do not reply to this email.  You can add comments at
http://bugzilla.ubuntu.com/show_bug.cgi?id=13046
Ubuntu | linux


fabbione at ubuntu.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEEDINFO




------- Additional Comments From fabbione at ubuntu.com  2005-07-28 14:51 UTC -------
(In reply to comment #0)
> Hoary install from shipit CD.
> 
> The console displays -
> ata7: command timeout
> and the system then freezes solid. No response on ethernet and hitting caps
> lock does not toggle the LED.
> 
> /var/log/syslog contains loads of errors of the type -
> Jul 25 17:33:42 localhost kernel: ata7: status=0x51 { DriveReady SeekComplete Error }
> Jul 25 17:33:42 localhost kernel: ata7: error=0x0c { DriveStatusError }
> 
> It's not always ata7 - sometimes I get ata6, or ata8. These errors occur just
> after the s/w raid arrays are mounted and then if I stress the array with a
> large copy or with dd. If I stress the array even more with other operations
> at the same time then I get the console message and the lockup.
> 
> I've tried acpi=off
> I've tried turning on P&P OS in the BIOS.
> I've tried the 2.6.12 kernel from Breezy.
> I've tried removing smartd and hddtemp (to see if libata pass through might be
> an issue)
> I still get exactly the same errors and lockup.
> 
> Slightly similar report here -
> http://www.uwsg.iu.edu/hypermail/linux/kernel/0507.1/0651.html
> 
> Hardware:
> Asus A7V600 mobo with a Athlon 2000+ processor, 512MB of RAM
> 4x Maxtor PDC20375 SATA cards (rebranded Promise TX2plus) in PCI slots.
> Each drive is cabled to 2 SATA HDs and these are used as two groups of four
> for software raid using mdadm. Boot drive is a non-raid PATA drive.
> Four of the drives are ST3250823AS - these are OK
> The other four are ST3300831AS - these are the ones that give errors.
> Only four of the drives are seen at BIOS boot time but I'm not sure which
> ones. Linux spots all 8 drives and the raid arrays work fine unless stressed.
>
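
To see whether the errors cluster on particular ports, the syslog lines quoted
above could be tallied per port. The following is only a rough sketch: the
regular expression is written against the two sample lines in the report, the
function names are made up for illustration, and the bit names are the
standard ATA status register mnemonics (0x51 = DRDY | DSC | ERR, which matches
"DriveReady SeekComplete Error").

```python
import re
from collections import Counter

# Standard ATA status register bits, highest bit first.
STATUS_BITS = {
    0x80: "BSY",
    0x40: "DRDY",
    0x20: "DF",
    0x10: "DSC",
    0x08: "DRQ",
    0x04: "CORR",
    0x02: "IDX",
    0x01: "ERR",
}

# Assumption: kernel lines look like the samples quoted in the report.
LINE_RE = re.compile(r"kernel: (ata\d+): status=(0x[0-9a-fA-F]+)")


def decode_status(value):
    """Return the names of the status bits set in value."""
    return [name for bit, name in STATUS_BITS.items() if value & bit]


def tally_status_errors(lines):
    """Count 'status=' lines per (port, status value) pair."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2), 16))] += 1
    return counts


# The two lines from the report, with the wrapped first line rejoined.
SAMPLE = [
    "Jul 25 17:33:42 localhost kernel: ata7: status=0x51 "
    "{ DriveReady SeekComplete Error }",
    "Jul 25 17:33:42 localhost kernel: ata7: error=0x0c { DriveStatusError }",
]

if __name__ == "__main__":
    for (port, status), n in sorted(tally_status_errors(SAMPLE).items()):
        print(port, hex(status), decode_status(status), n)
```

Feeding it the lines of /var/log/syslog instead of SAMPLE would show whether
the errors follow ata6, ata7 and ata8 evenly or favour one port.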

If you do NOT have important data on the RAID, I would try swapping the
hard disks around to isolate which controller/disk combination is causing
the problem. Errors like the ones you see are generally caused either by
a bad set of disks or by bad cables.
There are other possible causes as well, for example the system not
providing enough power to keep the disks in a proper state under heavy load.

You could also try to reproduce the problem with only one disk connected to
the problematic controller.

There are several combinations of tests to do :(
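
To keep track of those combinations, something like the sketch below might
help. It only enumerates single-disk test runs and counts failures per card
and per disk; the card and disk labels are hypothetical placeholders (the
report does not say which disk sits on which card), and the pass/fail results
have to be filled in by hand after each stress run.

```python
from collections import Counter
from itertools import product

# Hypothetical labels: four controller cards and the eight disks by model.
CARDS = ["card0", "card1", "card2", "card3"]
SUSPECT_DISKS = [f"ST3300831AS-{i}" for i in range(1, 5)]
GOOD_DISKS = [f"ST3250823AS-{i}" for i in range(1, 5)]


def single_disk_tests(cards, disks):
    """Every (card, disk) pair, each to be stressed with one disk attached."""
    return list(product(cards, disks))


def failures_by(results):
    """Count failed runs per card and per disk.

    results maps (card, disk) -> True if that run showed errors.
    """
    per_card, per_disk = Counter(), Counter()
    for (card, disk), failed in results.items():
        if failed:
            per_card[card] += 1
            per_disk[disk] += 1
    return per_card, per_disk


if __name__ == "__main__":
    plan = single_disk_tests(CARDS, SUSPECT_DISKS + GOOD_DISKS)
    print(len(plan), "runs to do")
    for card, disk in plan:
        print(card, disk)
```

If the failures pile up on one disk regardless of card, the disk (or its
cable) is the likely suspect; if they pile up on one card regardless of disk,
look at the controller.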

Fabio
