System Load Extreme and Non Responsive

Xandros Pilosa folivora.pilosa at gmail.com
Thu Nov 26 07:44:39 UTC 2009


On Wed, 25.11.2009 at 11:42 -0500, Keith Clark wrote:
> My system seems to become unresponsive every about half hour and the
> Load goes through the roof.  CPU usage does not seem to be high, nor
> network usage during this period.
> 
> I looked at my syslog for that time period and I see the following block
> of entries over and over again:
> 

Hello Keith,
the following comments are taken from [1]:

> Nov 25 11:28:57 bookworm-acerdesktop kernel: [ 9440.533198] ata1.00:
> exception Emask 0x0 SAct 0x7 SErr 0x0 action 0x0

Emask == internal error mask for the ATA command (AC_ERR_xxx in the source code)
SAct == SATA SActive register
SErr == SATA SError register
action == action to be taken by the error handler

> Nov 25 11:28:57 bookworm-acerdesktop kernel: [ 9440.533208] ata1.00:
> irq_stat 0x40000008
> Nov 25 11:28:57 bookworm-acerdesktop kernel: [ 9440.533226] ata1.00: cmd
> 60/f8:08:89:22:8b/00:00:1b:00:00/40 tag 1 ncq 126976 in

The "cmd" line gives the ATA command (taskfile) sent to the device
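
To make the taskfile readable, here is a minimal Python sketch that splits the string into its fields. It assumes the field order libata uses when printing (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device); the helper names are my own and not part of any tool.

```python
# Field order assumed from how libata prints a taskfile in its error messages.
FIELDS = ["command", "feature", "nsect", "lbal", "lbam", "lbah",
          "hob_feature", "hob_nsect", "hob_lbal", "hob_lbam", "hob_lbah",
          "device"]


def parse_taskfile(tf):
    """Split a taskfile string such as '60/f8:08:...' into named byte fields."""
    values = [int(b, 16) for b in tf.replace("/", ":").split(":")]
    if len(values) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(values)))
    return dict(zip(FIELDS, values))


def lba48(f):
    """Assemble the 48-bit LBA from the six LBA bytes."""
    return (f["hob_lbah"] << 40 | f["hob_lbam"] << 32 | f["hob_lbal"] << 24
            | f["lbah"] << 16 | f["lbam"] << 8 | f["lbal"])


def ncq_sector_count(f):
    """For NCQ commands the sector count travels in the feature fields."""
    return f["hob_feature"] << 8 | f["feature"]


cmd = parse_taskfile("60/f8:08:89:22:8b/00:00:1b:00:00/40")
print(hex(cmd["command"]), hex(lba48(cmd)), ncq_sector_count(cmd) * 512)
# prints: 0x60 0x1b8b2289 126976
```

Command 0x60 is READ FPDMA QUEUED (an NCQ read), and 248 sectors of 512 bytes give the 126976 bytes shown at the end of the "cmd" line.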

> Nov 25 11:28:57 bookworm-acerdesktop kernel: [ 9440.533230]          res
> 41/40:00:22:23:8b/e7:00:1b:00:00/40 Emask 0x409 (media error) <F>

media error == Software detected a media error

> Nov 25 11:28:57 bookworm-acerdesktop kernel: [ 9440.533260] ata1.00:
> status: { DRDY ERR }

DRDY == device ready; ERR == error bit set in the status register (the
command failed, see the "error" line below for the reason)

> Nov 25 11:28:57 bookworm-acerdesktop kernel: [ 9440.533264] ata1.00:
> error: { UNC }

UNC == Uncorrectable error - often due to bad sectors on the disk
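
The "status" and "error" lines are just the first two bytes of the "res" taskfile decoded bit by bit. A small Python sketch of that decoding follows; the bit tables list only the flags libata prints by name, and the function name is made up for illustration.

```python
# Standard ATA status and error register bits, limited to the names libata prints.
STATUS_BITS = {0x80: "Busy", 0x40: "DRDY", 0x20: "DF", 0x08: "DRQ", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}


def decode_bits(value, table):
    """Return the names of all flags from `table` that are set in `value`."""
    return [name for bit, name in table.items() if value & bit]


res = "41/40:00:22:23:8b/e7:00:1b:00:00/40"
# The first group of the result taskfile is "status/error".
status, error = (int(x, 16) for x in res.split(":")[0].split("/"))

print("status:", decode_bits(status, STATUS_BITS))  # status: ['DRDY', 'ERR']
print("error: ", decode_bits(error, ERROR_BITS))    # error:  ['UNC']
```

So 0x41 means the drive is ready and reports an error, and 0x40 in the error register is the UNC flag.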

> Nov 25 11:28:57 bookworm-acerdesktop kernel: [ 9440.534164] ata1.00:
> SB600 AHCI: limiting to 255 sectors per cmd
> Nov 25 11:28:57 bookworm-acerdesktop kernel: [ 9440.535146] ata1.00:
> SB600 AHCI: limiting to 255 sectors per cmd
> Nov 25 11:28:57 bookworm-acerdesktop kernel: [ 9440.535152] ata1.00:
> configured for UDMA/133
> Nov 25 11:28:57 bookworm-acerdesktop kernel: [ 9440.535171] ata1: EH
> complete
> 
> Is this evidence of a drive error?  

I would say yes, at least very probably.

> It seems to have begun after a power
> failure and only happens to one user, the one who was logged in at the
> time of the power failure.  The other user does not experience this same
> issue.
> 
> Thanks,
> 
> Keith

One explanation would be that the bad sectors are on the part of the
disk where /home/affected_user resides, possibly because the drive was
reading or writing that area when the power failed. In any case, back up
your data first (I suppose you already did that); after that, "badblocks"
and/or other hard disk diagnostic tools would be in order.
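
If you only want to re-check the area named in the log rather than the whole disk, something like the following Python sketch could build a read-only badblocks call for that range. It assumes 512-byte logical sectors (consistent with 248 sectors = 126976 bytes in the log), takes the start LBA 0x1b8b2289 and the count of 248 from the "cmd" line above, and uses /dev/sdX as a placeholder for the real device. The function name is hypothetical.

```python
def badblocks_command(device, first_sector, last_sector):
    """Build a read-only badblocks call covering first_sector..last_sector.

    -b 512 makes badblocks' block numbers match 512-byte sectors,
    -s shows progress, -v is verbose. Without -n or -w the test is
    read-only. badblocks expects the last block BEFORE the first block.
    """
    if first_sector > last_sector:
        raise ValueError("first_sector must not exceed last_sector")
    return ["badblocks", "-b", "512", "-s", "-v",
            device, str(last_sector), str(first_sector)]


start = 0x1b8b2289          # LBA from the "cmd" line
count = 248                 # sectors requested by that command
cmd = badblocks_command("/dev/sdX", start, start + count - 1)
print(" ".join(cmd))
```

The printed command has to be run as root, with /dev/sdX replaced by the actual disk; a full-disk run simply omits the two block numbers.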

[1] http://ata.wiki.kernel.org/index.php/Libata_error_messages

Regards




