Re: [HW PROBLEM] Intel I7 MCE. Erratum or not?

From: Giangiacomo Mariotti
Date: Mon Dec 08 2008 - 01:48:24 EST


I noticed something else, which though may be due to my inexperience
with mce messages. In my directory /sys/devices/system/machinecheck
there are machinecheck0-7(one for each logical cpu of my system I
presume). Having received the MCE log always for cpu 0, I went to look
inside dir machinecheck0 and I found bank0-5ctl. So now my question
is, why do I receive MCE logs about bank 6, if my cpus don't have a
bank 6? Does that count start from 1? Or am I missing something else?

Log of MCEs(They all happended once for each reboot):

"Boot 0"
MCE 0
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 BANK 6 MISC 202d ADDR ffeef740
MCG status:
MCi status:
Error overflow
Uncorrected error
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: Generic CACHE Level-2 Data-Write Error
STATUS ee0000000100014a MCGSTATUS 0

"Boot 1"
MCE 0
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 BANK 6 MISC 308 ADDR ffefac00
MCG status:
MCi status:
Error overflow
Uncorrected error
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: Generic CACHE Level-2 Read Error
STATUS ee0000000100011a MCGSTATUS 0

"Boot 2"
MCE 0
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 BANK 6 MISC 212d ADDR ffef77c0
MCG status:
MCi status:
Error overflow
Uncorrected error
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: Generic CACHE Level-2 Data-Write Error
STATUS ee0000000100014a MCGSTATUS 0

"Boot 3"
MCE 0
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 BANK 6 MISC 202d ADDR ffee0280
MCG status:
MCi status:
Error overflow
Uncorrected error
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: Generic CACHE Level-2 Data-Write Error
STATUS ee0000000100014a MCGSTATUS 0

"Boot 4"
MCE 0
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 BANK 6 MISC 202d ADDR ffef5cc0
MCG status:
MCi status:
Error overflow
Uncorrected error
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: Generic CACHE Level-2 Data-Write Error
STATUS ee0000000100014a MCGSTATUS 0

"Boot 5"
MCE 0
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 BANK 6 MISC 212d ADDR ffef2f40
MCG status:
MCi status:
Error overflow
Uncorrected error
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: Generic CACHE Level-2 Data-Write Error
STATUS ee0000000100014a MCGSTATUS 0

Thanks for the help,

Giangiacomo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/