Machine Check Exception

Daniel J Blueman daniel.blueman at gmail.com
Tue May 11 10:10:15 UTC 2010


On Tue, May 11, 2010 at 6:45 AM, Kaushal Shriyan
<kaushalshriyan at gmail.com> wrote:
> On Fri, May 7, 2010 at 10:24 PM, Kaushal Shriyan
> <kaushalshriyan at gmail.com> wrote:
>> Hi,
>>
>> I get the information below on the console of an Ubuntu Linux 8.04
>> server. I did some basic troubleshooting by replacing the RAM, but
>> that did not help. Also, I do not know how to run mcelog on this
>> host. Please suggest further steps I can take to fix this issue.
>>
>> Thanks and Regards,
>>
>> Kaushal
>>
>> #########################################################################################################################
>> ata 5.00 : exception Emask 0x1 SAct 0x0 SErr 0X0 action 0x0
>> ata 5.00 : CPB resp_flags 0x11 :, CMD error
>> ata 5.00 : cmd c8/00:88:6f:f2:01/00:00:00:00:00/e5 tag0 dma 69632
>> ata 5.00 : status : {DRDY ERR}
>> ata 5.00 : error : {UNC}
>>
>> HARDWARE ERROR
>>
>> CPU 1 : MACHINE CHECK EXCEPTION
>> 4 BANK 4 : b200000000070f0f
>> TSC 24adf1489e
>>
>> This is not a software problem !
>> Run through mce-log ascii to decode and contact your hardware vendor
>>
>> HARDWARE ERROR
>>
>> CPU 0 : MACHINE CHECK EXCEPTION
>> 4 BANK 4 : b200000000070f0f
>> TSC 24adf15158
>>
>> This is not a software problem !
>> Run through mce-log ascii to decode and contact your hardware vendor
>>
>> Kernel Panic - not syncing Machine check
>> #########################################################################################################################
>>
>
> Hi again,
>
> I have replaced all the RAM chips with a new set. It worked fine
> for some time, but when I start the MySQL server the system spews out
> the Machine Check Exception again.
>
> I have tried to decode the MCE on a test machine using the method below.
>
> #cat error
> CPU 1 : MACHINE CHECK EXCEPTION 4 BANK 4 : b200000000070f0f
> TSC 24adf1489e
> # /usr/sbin/mcelog --k8 --ascii < error
> HARDWARE ERROR. This is *NOT* a software problem!
> Please contact your hardware vendor
> CPU 1 0 data cache TSC 24adf1489e
> STATUS 0 MCGSTATUS 0
> #cat error
> CPU 0 : MACHINE CHECK EXCEPTION 4 BANK 4 : b200000000070f0f
> TSC 24adf15158
> # /usr/sbin/mcelog --k8 --ascii < error
> HARDWARE ERROR. This is *NOT* a software problem!
> Please contact your hardware vendor
> CPU 0 0 data cache TSC 24adf15158
> STATUS 0 MCGSTATUS 0
>
> I am not sure what those errors mean. Please suggest further steps.
>
> Thanks,
>
> Kaushal

Kaushal, this isn't the right place to post for hardware help, so
please post to the right group.

Some tips for now:

It's worthwhile loading BIOS defaults and checking that the processor
and/or chipset aren't getting too hot (e.g. dried-up thermal compound
under the heatsinks, insufficient case exhaust airflow), and possibly
updating the BIOS.

If the problem still persists, it's most likely a motherboard issue;
RMA it if it's in warranty.
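
On the decode attempt: the "STATUS 0" in the mcelog output suggests it
did not pick up the status value from the reformatted input, so nothing
was really decoded. As a rough cross-check, the status word from the
console (b200000000070f0f) can be split by hand. The sketch below is
only that, a sketch: it assumes the standard x86 machine-check layout
for the top flag bits and the common "bus/interconnect" error code
pattern for the low 16 bits. The function and field names are made up
for illustration, and the model-specific code in bits 16-31 still needs
the CPU vendor's documentation to interpret.

```python
# Sketch: split a 64-bit machine-check bank status value into fields.
# Flag positions follow the architectural x86 machine-check layout.
# Function and dictionary key names here are illustrative only.

STATUS_FLAGS = {
    63: "VAL",    # register contents are valid
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting of this error was enabled
    59: "MISCV",  # the bank's MISC register holds valid data
    58: "ADDRV",  # the bank's ADDR register holds a valid address
    57: "PCC",    # processor context may be corrupt
}


def decode_mci_status(status):
    """Return the set flags and the two 16-bit error code fields."""
    flags = [
        name
        for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    return {
        "flags": flags,
        "model_specific_code": (status >> 16) & 0xFFFF,  # vendor defined
        "mca_error_code": status & 0xFFFF,
    }


def decode_bus_error(code):
    """Split a code matching the pattern 0000 1PPT RRRR IILL.

    Returns None if the code does not match that pattern. Field values
    are returned raw; their meaning is left to the vendor manuals.
    """
    if (code & 0xF800) != 0x0800:
        return None
    return {
        "participation": (code >> 9) & 0x3,  # PP
        "timeout": bool((code >> 8) & 0x1),  # T
        "request": (code >> 4) & 0xF,        # RRRR
        "memory_or_io": (code >> 2) & 0x3,   # II
        "level": code & 0x3,                 # LL
    }


if __name__ == "__main__":
    decoded = decode_mci_status(0xB200000000070F0F)
    print(decoded["flags"])
    print(hex(decoded["model_specific_code"]))
    print(decode_bus_error(decoded["mca_error_code"]))
```

For the value above this reports the VAL, UC, EN and PCC flags, i.e. a
valid, uncorrected error with possibly corrupt processor context, which
fits the kernel panic. The low 16 bits match the bus/interconnect
pattern with the timeout bit set.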

Thanks,
  Daniel
-- 
Daniel J Blueman




More information about the kernel-team mailing list