Corrupted ext3 filesystem on top of Adaptec RAID5 array

Michael R. Head burner at suppressingfire.org
Thu Jul 13 03:43:27 UTC 2006


Apparently the machine was also drawing so much power that the UPS was
in overload mode and sounding alarms. Perhaps some process was running
that caused the machine (and the array) to overheat, or maybe 5
SATA-II drives are too much for the box...

On Wed, 2006-07-12 at 11:20 -0800, Michael R. Head wrote:
> On Tue, 2006-07-11 at 13:01 +0200, Tchize wrote:
> > Hi Michael,
> > 
> > I am not an expert on this kind of Linux filesystem behaviour, but your
> > error is very surprising. If you were not on top of a RAID-5, I would
> > bet on a hard drive failure...
> 
> Right. It's very surprising.
> 
> > Please do the following check:
> > 
> > - if it's hardware RAID, use the BIOS to check the RAID logs for any 
> > messages about dead disks.
> 
> I'm 4000 miles away from the machine at the moment, so I won't be able
> to check it myself until I get back on Monday...
> 
> > - if it's software RAID, use the Linux tools to see the status of the RAID
> > - also take a look at the various disks' S.M.A.R.T. reports
> > 
> > smartctl -H /dev/hdx
> > smartctl -l error /dev/hdx
> 
> All the physical drives are hidden behind the RAID driver, so I only
> have the logical /dev/sda drive. Plus, I don't have the SMART tools
> installed.
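
Newer smartmontools releases can sometimes reach the physical disks
behind an Adaptec controller through the "aacraid" device type, even
though only the logical /dev/sda is visible. The sketch below only prints
the commands to try. The host and LUN numbers (0,0), the disk numbering
(0 to 4 for the 5 drives) and the helper name print_smart_cmds are all
assumptions for illustration, and nothing in this thread confirms that
this particular controller and driver support the passthrough.

```shell
#!/bin/bash
# Print smartctl invocations for each physical disk behind an aacraid
# controller. Nothing is executed; review the output before running it.
# Assumptions: host 0, LUN 0, disks numbered 0..N-1, logical device /dev/sda.
print_smart_cmds() {
    local ndisks="${1:-5}"
    local dev="${2:-/dev/sda}"
    local id=0
    while [ "$id" -lt "$ndisks" ]; do
        # -d aacraid,H,L,ID selects one disk by host, LUN and ID
        echo "smartctl -H -d aacraid,0,0,$id $dev"
        echo "smartctl -l error -d aacraid,0,0,$id $dev"
        id=$((id + 1))
    done
}
```

Calling print_smart_cmds with no arguments lists the health and error-log
commands for five disks on /dev/sda; the printed lines can then be run by
hand as root once smartmontools is installed.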
> 
> > Michael R. Head wrote:
> > > Woke up this morning to find one of my dapper servers in the midst of a
> > > crisis. The filesystem had been remounted readonly and large parts are
> > > corrupted. The machine is fairly new, and a backup regime has not been
> > > put in place yet.
> > >
> > > I'm curious about a few things:
> > >
> > > 1) How did this happen? The logs appear to have been destroyed along
> > > with most any diagnostic information. (syslog is zeroed out)
> > > 2) What's the best way to recover?
> > > 3) Has anyone seen something similar?
> > > 4) How can I prevent this in the future?
> > >
> > > Here are some symptoms:
> > > 1) The filesystem remounted read only due to some error I can't discern
> > > 2) several programs simply fail to run:
> > > burner at rhea:~$ mount
> > > -bash: /bin/mount: Input/output error
> > > burner at rhea:~$ lspci
> > > -bash: /usr/bin/lspci: Input/output error
> > > burner at rhea:~$ dmesg
> > > -bash: /bin/dmesg: Input/output error
> > > 3) /var/log/syslog was corrupted and its contents zeroed out:
> > > burner at rhea:~$ ls -l /var/log/syslog
> > > -rw-r----- 1 root adm 3699 2006-07-10 14:10 /var/log/syslog
> > > burner at rhea:~$ less /var/log/syslog
> > > "/var/log/syslog" may be a binary file.  See it anyway?
> > > ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^
> > > 4) Many files are in a bizarre state:
> > > burner at rhea:~$ ls -l /usr/bin |grep -v rw |head
> > > total 162456
> > > ?---------  ? ?      ?             ?                ? /usr/bin/411toppm
> > > ?---------  ? ?      ?             ?                ? /usr/bin/a2p
> > > ?---------  ? ?      ?             ?                ? /usr/bin/a2ping
> > > ?---------  ? ?      ?             ?                ? /usr/bin/afm2tfm
> > > ?---------  ? ?      ?             ?                ? /usr/bin/afmdiff.awk
> > > ?---------  ? ?      ?             ?                ? /usr/bin/allcm
> > > ?---------  ? ?      ?             ?                ? /usr/bin/allneeded
> > > ?---------  ? ?      ?             ?                ? /usr/bin/anytopnm
> > > ?---------  ? ?      ?             ?                ? /usr/bin/asciitopgm
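
When /bin/mount itself returns an I/O error, as in symptom 2 above, the
kernel's mount table can usually still be read with shell builtins alone,
because /proc is not stored on the damaged disk. A minimal sketch,
assuming a shell that is already running and a mounted /proc; ro_mounts
is a made-up helper name, and its optional argument exists only so the
function can be tried against a saved copy of the table.

```shell
#!/bin/bash
# List filesystems currently mounted read-only, using only shell builtins,
# so it still works when /bin/mount cannot be read from disk.
ro_mounts() {
    local table="${1:-/proc/mounts}"
    local dev mnt fstype opts rest
    while read -r dev mnt fstype opts rest; do
        # opts is a comma-separated list such as "ro,data=ordered";
        # wrapping it in commas matches "ro" only as a whole option,
        # so "errors=remount-ro" on its own does not count.
        case ",$opts," in
            *,ro,*) echo "$mnt ($fstype on $dev) is read-only" ;;
        esac
    done < "$table"
}
```

Running ro_mounts with no arguments reads /proc/mounts and prints one
line per read-only filesystem, which confirms the remount without
touching any binary on the affected partition.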
> > >
> > > I'm baffled, but I do have a possible cause.
> > > Recently I resized (shrunk) the root filesystem to make more room for
> > > swap. For this, I used the desktop CD with gparted to shrink the
> > > filesystem and partition for '/' and increase the swap partition.
> > > Is it possible that this caused the failure?
> > >
> > > mike
> > >   
> > 
> > 
> -- 
> Michael R. Head <burner at suppressingfire.org>
> suppressingfire.org
> 
> 
-- 
Michael R. Head <burner at suppressingfire.org>
suppressingfire.org