Re: [HW PROBLEM] Intel I7 MCE. Erratum or not?

From: Giangiacomo Mariotti
Date: Sat Dec 06 2008 - 16:43:40 EST


On Sat, Dec 6, 2008 at 9:58 PM, Robert Hancock <hancockr@xxxxxxx> wrote:
> Giangiacomo Mariotti wrote:
>>
>> Hi everyone,
>> Mcelog just logged this on my new Intel I7 920 (on Linux 2.6.27.8):
>> MCE 0
>> HARDWARE ERROR. This is *NOT* a software problem!
>> Please contact your hardware vendor
>> CPU 0 BANK 6 MISC 202d ADDR ffeef740
>> MCG status:
>> MCi status:
>> Error overflow
>> Uncorrected error
>> MCi_MISC register valid
>> MCi_ADDR register valid
>> Processor context corrupt
>> MCA: Generic CACHE Level-2 Data-Write Error
>> STATUS ee0000000100014a MCGSTATUS 0
>>
>> I'm reporting this here because I found in the Intel I7 Technical
>> Specification November 2008 update that something which seems very
>> similar is in fact an erratum. So my question is: Is there any way
>> for me to verify that my problem is due to one of those errata, instead
>> of broken hardware (if we don't want to consider all those errata as
>> broken hardware)? I'm also reporting this because I thought it may be
>> useful to signal that (if actually due to those errata) these problems
>> actually occur, so it may be useful to find workarounds in the kernel
>> to not scare to death poor Linux users!
>
> Which erratum are you talking about? I don't see one in that document that
> would match this case..
>
Well, the first one seems very similar, even though it describes a DTLB
error rather than a cache error. But sure, being similar doesn't mean
much. Number 52 seems similar too. I guess I should just give up and
admit that my hardware is broken!
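
For anyone who wants to compare the logged STATUS value against an erratum
description by hand, here is a minimal sketch of how the value can be
decoded. It assumes the IA32_MCi_STATUS layout from the machine-check
chapter of the Intel SDM (flag bits 63..57, MCA error code in bits 15:0,
model-specific code in bits 31:16) and the compound cache-hierarchy code
format 0000 0001 RRRR TTLL. The function name decode_status and the
dictionary keys are made up for illustration; this is not mcelog's code.

```python
# Flag bits of IA32_MCi_STATUS (bit position -> short name).
STATUS_FLAGS = {
    63: "VAL",    # register contents valid
    62: "OVER",   # error overflow
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting enabled
    59: "MISCV",  # MCi_MISC register valid
    58: "ADDRV",  # MCi_ADDR register valid
    57: "PCC",    # processor context corrupt
}

# Sub-fields of a cache-hierarchy MCA error code (0000 0001 RRRR TTLL).
RRRR = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
        5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
TT = {0: "I", 1: "D", 2: "G"}               # instruction, data, generic
LL = {0: "L0", 1: "L1", 2: "L2", 3: "LG"}   # cache level, LG = generic


def decode_status(status):
    """Split a 64-bit MCi_STATUS value into flags and error-code fields."""
    flags = [name for bit, name in STATUS_FLAGS.items()
             if (status >> bit) & 1]
    mca_code = status & 0xFFFF             # architectural MCA error code
    model_code = (status >> 16) & 0xFFFF   # model-specific error code
    result = {"flags": flags, "mca_code": mca_code, "model_code": model_code}
    # Only decode further if the code matches the cache-hierarchy pattern.
    if (mca_code >> 8) == 0x01:
        result["cache"] = {
            "request": RRRR.get((mca_code >> 4) & 0xF, "unknown"),
            "type": TT.get((mca_code >> 2) & 0x3, "unknown"),
            "level": LL.get(mca_code & 0x3, "unknown"),
        }
    return result


if __name__ == "__main__":
    print(decode_status(0xEE0000000100014A))
```

For the value in the log above, the flags come out as VAL, OVER, UC,
MISCV, ADDRV and PCC, and the cache fields as DWR / G / L2, which agrees
with the "Generic CACHE Level-2 Data-Write Error" line that mcelog printed.
The model-specific code is the part most likely to matter when comparing
against an erratum, and its meaning is not covered by this sketch.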