Re: [HW PROBLEM] Intel I7 MCE. Erratum or not?

From: Giangiacomo Mariotti
Date: Mon Dec 08 2008 - 07:37:58 EST


On Mon, Dec 8, 2008 at 1:04 PM, Andi Kleen <ak@xxxxxxxxxxxxxxx> wrote:
>
>> I'm not sure but is it make sense if SMI causes an error and dropped MCE
>> exception without clearing error record?
>
> I wouldn't expect an SMI causing an internal cache machine check. That
> really
> looks more like something that comes out of the initialization code. But
> I haven't see it on any Nehalem system so far.
>
> Also it might be really some problem with this particular CPU.
>
> -Andi
>
If there's any test and/or anything else I can do to help tracking
down what exactly causes the problem, just let me know. I've already
done many cpu+memory intensive tests(in Linux for example a tbench
with 256 threads and various kernel compilations) to see if the mce
appears during normal working time of the cpus, but nothing happened.
As I've already said in this thread, the mce appears only at a
particular moment after reboot. So I have exactly 1 mce log after
every reboot, always at [ 301.7320xx] and with mce=nobootlog no mce
gets logged. Windows Vista never gave me the BSOD, not even while
taking all the tests of 3D Mark Vantage(yes, even if I'm a grown up, I
still like VGs). The fact that windows never crashed should mean that
the mce was never triggered on that OS, could that be somewhat related
to the other problem I talked about here, i.e. that I cannot reboot on
Linux and I'm forced to halt the system everytime I want to reboot,
while on windows I don't have any similar problem?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/