RE: Help with machine check exception

From: Roger Heflin
Date: Thu Jan 12 2006 - 12:21:11 EST




> -----Original Message-----
> From: linux-kernel-owner@xxxxxxxxxxxxxxx
> [mailto:linux-kernel-owner@xxxxxxxxxxxxxxx] On Behalf Of
> Orion Poplawski
> Sent: Thursday, January 12, 2006 10:30 AM
> To: linux-kernel@xxxxxxxxxxxxxxx
> Subject: Help with machine check exception
>
> Can someone help determine the problem here? Does it
> definitely point to a bad CPU, or possibly a bad motherboard?
>
> Thanks!
>
> CPU 0: Machine Check Exception: 4 Bank 4: b200000000070f0f
> TSC 184fcd0553e4
> Kernel panic - not syncing: Machine check
>

If this is an Opteron, the cause is either the CPU or memory. A
DIMM failing in the right way will cause it, and I have seen a CPU
cause it. I don't know that I have seen a motherboard cause it,
and we have fixed a fair number of these errors. If it is memory,
it can be any of the DIMMs attached to that CPU.
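
For reference, the two numbers in the quoted panic can be split into
their flag bits. The sketch below is only an illustration, assuming
the standard x86 layout of the MCG_STATUS and MCi_STATUS registers;
the function names are made up for this example, and bits 16-31 of
the bank status are model-specific, so they are extracted but not
interpreted here.

```python
# Architectural flag bits of an x86 MCi_STATUS register, keyed by bit position.
# Assumption: standard x86 machine-check layout; names below are illustrative.
MCI_FLAGS = {
    63: "VAL",    # register holds a valid error record
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting of this error was enabled
    59: "MISCV",  # MCi_MISC holds additional information
    58: "ADDRV",  # MCi_ADDR holds the failing address
    57: "PCC",    # processor context may be corrupt
}

# MCG_STATUS flag names, indexed by bit position (bit 0 first).
MCG_FLAGS = ["RIPV", "EIPV", "MCIP"]


def decode_mci_status(status):
    """Split a 64-bit bank status word into flags and error-code fields."""
    flags = [name for bit, name in MCI_FLAGS.items() if (status >> bit) & 1]
    return {
        "flags": flags,
        # Bits 16-31 are model-specific; left undecoded on purpose.
        "model_specific_code": (status >> 16) & 0xFFFF,
        # Bits 0-15 carry the MCA error code.
        "mca_error_code": status & 0xFFFF,
    }


def decode_mcg_status(mcg):
    """Return the names of the flags set in the global status value."""
    return [name for bit, name in enumerate(MCG_FLAGS) if (mcg >> bit) & 1]


if __name__ == "__main__":
    print(decode_mcg_status(0x4))
    print(decode_mci_status(0xB200000000070F0F))
```

With the values from the report, the global status 4 has only MCIP
set, and the bank status has VAL, UC, EN and PCC set with ADDRV
clear, so no failing address was recorded. That is consistent with an
uncorrected error, but it does not by itself say whether the CPU or a
DIMM is at fault.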

I have also seen this error kill a machine during boot, but that
case looks more like something being cleared improperly, and it may
only affect much older versions of 2.6. In that case the hardware
is not broken, and the error will not reproduce after a reboot.

Roger
