MD RAID-1 deadlocks?

Anders Karlsson trudheim at gmail.com
Wed Dec 7 13:37:56 UTC 2005


Hi,

Apologies for cross-posting, but I believe it is relevant to all three lists.

For several weeks now I have had problems with my system at home. It is
an Ubuntu Breezy install (installed from scratch) that runs a few
services, nothing spectacular. The server locks up solid, sometimes
within seconds of finishing booting and my kicking off a kernel compile
again.

A common theme is that as soon as I kick off a kernel compile, the box
locks up shortly afterwards (within 1-15 min). All interrupts lost, kbd
dead, can't ssh in. Alt-SysRq doesn't work, can't break into KDB. There
is never anything in the logs to give any hint about the cause.

I have in turn tried:
 - ran memcheck (came back clean, will run again overnight tonight)
 - took out disks that are not in use (at the moment)
 - tried kernel 2.6.14 (took the config from Ubuntu's 2.6.12 and went from there)
 - tried kernels compiled with gcc 4.0 and with gcc 3.4, no difference
 - tried 2.6.14.3 with kdb 4.4, serial console and all available
   debug options (no Oops anywhere, just a straight locked-up box)
 - currently using 2.6.15-rc4 with kdb 4.4, most things compiled into the kernel
 - reseated CPU and heatsink, applied new thermal paste
 - moved PCI cards (in case of an 'IRQ clash')
 - changed AGP gfx card (from an nVidia to an ATI)
 - ran cpuburn (burnK7 and burnMMX), no problem

Yet to test:
 - bonnie++


The system has the following installed at the moment:
0000:00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge (rev 80)
0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge
0000:00:09.0 Ethernet controller: 3Com Corporation 3c940 10/100/1000Base-T [Marvell] (rev 12)
0000:00:0a.0 RAID bus controller: Silicon Image, Inc. (formerly CMD Technology Inc) PCI0680 Ultra ATA-133 Host Controller (rev 02)
0000:00:0d.0 SCSI storage controller: LSI Logic / Symbios Logic 53C896/897 (rev 07)
0000:00:0d.1 SCSI storage controller: LSI Logic / Symbios Logic 53C896/897 (rev 07)
0000:00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 80)
0000:00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
0000:00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
0000:00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [K8T800 South]
0000:00:11.5 Multimedia audio controller: VIA Technologies, Inc. VT8233/A/8235/8237 AC97 Audio Controller (rev 60)
0000:01:00.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200 SE] (rev 01)
0000:01:00.1 Display controller: ATI Technologies Inc RV280 [Radeon 9200 SE] (Secondary) (rev 01)

The storage is two 80GB Maxtor PATA disks on separate IDE channels
(hda and hdc), each split into two partitions of 2GB and 78GB. Both
partitions are software mirrored (MD RAID-1): hda1+hdc1 make up md0 and
hda2+hdc2 make up md1. '/' was created on md0; md1 was made into an LVM
PV and LVM was set up on top of it.
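
To make the layout concrete, here is a minimal sketch of how the state
of the two mirrors could be checked by parsing /proc/mdstat. The sample
text below is hypothetical: it only mimics what a healthy md0/md1 pair
like the one described might report, and the block counts are made up.
The function names parse_mdstat and read_mdstat are my own, not part of
any tool.

```python
import re

# Hypothetical /proc/mdstat contents for a layout like the one described.
# Block counts are illustrative only, not taken from the real system.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md1 : active raid1 hdc2[1] hda2[0]
      78043648 blocks [2/2] [UU]

md0 : active raid1 hdc1[1] hda1[0]
      1951744 blocks [2/2] [UU]

unused devices: <none>
"""


def parse_mdstat(text):
    """Return {array_name: {"state", "level", "members", "status", "healthy"}}."""
    arrays = {}
    current = None
    for line in text.splitlines():
        # Array header, e.g. "md0 : active raid1 hdc1[1] hda1[0]"
        header = re.match(r"^(md\d+) : (\S+) (\S+) (.*)$", line)
        if header:
            name, state, level, members = header.groups()
            current = {
                "state": state,
                "level": level,
                # Strip the "[n]" slot index and any trailing flag such as "(F)".
                "members": sorted(
                    re.sub(r"\[\d+\].*$", "", m) for m in members.split()
                ),
                "status": None,
                "healthy": False,
            }
            arrays[name] = current
            continue
        # Status line, e.g. "1951744 blocks [2/2] [UU]"; "_" marks a missing disk.
        status = re.search(r"\[\d+/\d+\] \[([U_]+)\]", line)
        if status and current is not None:
            flags = status.group(1)
            current["status"] = flags
            current["healthy"] = current["state"] == "active" and "_" not in flags
    return arrays


def read_mdstat(path="/proc/mdstat"):
    """Parse the live file; only works on a Linux box with MD loaded."""
    with open(path) as handle:
        return parse_mdstat(handle.read())


if __name__ == "__main__":
    for name, info in sorted(parse_mdstat(SAMPLE_MDSTAT).items()):
        print(name, info["level"], info["members"], info["status"], info["healthy"])
```

On the real box, read_mdstat() would be called instead of parsing the
sample, for example straight after boot and again before starting a
kernel compile, to see whether either mirror is degraded or resyncing.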

I am not sure how much detail I should post here, but I can make
almost any details about the system available if it will help solve
this problem. It does seem to be connected to intense disk activity
though.

Any tips or pointers to what I can try next would be appreciated.

--
Anders Karlsson <trudheim at gmail.com>

