Machine Check Exception

Kaushal Shriyan kaushalshriyan at gmail.com
Tue May 11 05:45:00 UTC 2010


On Fri, May 7, 2010 at 10:24 PM, Kaushal Shriyan
<kaushalshriyan at gmail.com> wrote:
> Hi,
>
> I get the information below on the console of an Ubuntu Linux 8.04
> server. I did some basic troubleshooting by replacing the RAM, but it
> did not help. Also, I do not know how to run mcelog on this host.
> Please suggest further steps to take to fix this issue.
>
> Thanks and Regards,
>
> Kaushal
>
> #########################################################################################################################
> ata 5.00 : exception Emask 0x1 SAct 0x0 SErr 0X0 action 0x0
> ata 5.00 : CPB resp_flags 0x11 :, CMD error
> ata 5.00 : cmd c8/00:88:6f:f2:01/00:00:00:00:00/e5 tag0 dma 69632
> ata 5.00 : status : {DRDY ERR}
> ata 5.00 : error : {UNC}
>
> HARDWARE ERROR
>
> CPU 1 : MACHINE CHECK EXCEPTION
> 4 BANK 4 : b200000000070f0f
> TSC 24adf1489e
>
> This is not a software problem !
> Run through mce-log ascii to decode and contact your hardware vendor
>
> HARDWARE ERROR
>
> CPU 0 : MACHINE CHECK EXCEPTION
> 4 BANK 4 : b200000000070f0f
> TSC 24adf15158
>
> This is not a software problem !
> Run through mce-log ascii to decode and contact your hardware vendor
>
> Kernel Panic - not syncing Machine check
> #########################################################################################################################
>

Hi again,

I have replaced all the RAM chips with a new set. The system worked fine
for some time, but when I started the MySQL server it produced
the Machine Check Exception again.

I tried to decode the MCE on a test machine using the method below.

# cat error
CPU 1 : MACHINE CHECK EXCEPTION 4 BANK 4 : b200000000070f0f
TSC 24adf1489e
# /usr/sbin/mcelog --k8 --ascii < error
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 1 0 data cache TSC 24adf1489e
STATUS 0 MCGSTATUS 0
# cat error
CPU 0 : MACHINE CHECK EXCEPTION 4 BANK 4 : b200000000070f0f
TSC 24adf15158
# /usr/sbin/mcelog --k8 --ascii < error
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 0 data cache TSC 24adf15158
STATUS 0 MCGSTATUS 0
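
The "STATUS 0" in the decoded output does not match the status word in the
input (b200000000070f0f), so mcelog may not have parsed the pasted text as
intended. As a cross-check, the status word can be split by hand. The sketch
below is a minimal illustration, assuming the usual x86 machine-check status
register layout (flag bits 63..57, model-specific bits 31..16, MCA error code
in bits 15..0). The function and variable names are made up for this example
and are not part of mcelog.

```python
# Sketch: split an x86 machine-check bank status word into its common fields.
# Assumption: standard MCi_STATUS layout; names below are illustrative only.

STATUS_FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting was enabled for this error
    59: "MISCV",  # the MISC register holds extra information
    58: "ADDRV",  # the ADDR register holds a valid address
    57: "PCC",    # processor context may be corrupt
}


def decode_mci_status(status):
    """Return the set flags and the two 16-bit code fields of a status word."""
    flags = [
        name
        for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    return {
        "flags": flags,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


# Status word reported for bank 4 on both CPUs in the console output.
decoded = decode_mci_status(0xB200000000070F0F)
print("flags:", ", ".join(decoded["flags"]))
print("model-specific bits 31..16: 0x%04x" % decoded["model_specific_code"])
print("MCA error code bits 15..0:  0x%04x" % decoded["mca_error_code"])
```

With UC and PCC both set, the error is reported as uncorrected with possibly
corrupt processor context, which would be consistent with the kernel panic.
What the two code fields mean for bank 4 depends on the CPU model, so they
need to be looked up in the processor vendor's documentation.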

I am not sure what these errors mean. Please suggest further steps.

Thanks,

Kaushal
