Re: [HW PROBLEM] Intel I7 MCE. Erratum or not?

From: Hidetoshi Seto
Date: Mon Dec 08 2008 - 02:43:07 EST


Giangiacomo Mariotti wrote:
> I noticed something else, which though may be due to my inexperience
> with mce messages. In my directory /sys/devices/system/machinecheck
> there are machinecheck0-7(one for each logical cpu of my system I
> presume). Having received the MCE log always for cpu 0, I went to look
> inside dir machinecheck0 and I found bank0-5ctl. So now my question
> is, why do I receive MCE logs about bank 6, if my cpus don't have a
> bank 6? Does that count start from 1? Or am I missing something else?

Answer would be in the following commit:

> commit 8edc5cc5ec880c96de8e6686fb0d7a5231e91c05
> Author: Venki Pallipadi <venkatesh.pallipadi@xxxxxxxxx>
> Date: Mon May 12 15:43:34 2008 +0200
>
> x86: remove 6 bank limitation in 64 bit MCE reporting code
(snip)
> The patch below does not create sysfs control (bankNctl) for banks
> higher than 6 as well. That needs some pre-cleanup in /sysfs mce layout,
> removal of per cpu /sysfs entries for bankctl as they are really global
> system level control today. That change will follow. This basic change
> is critical to report the detailed errors on banks higher than 6.

So there are 6 sysfs control(bank0-5ctl) even if your cpu have more banks.

Old kernel with bank limitation will say:
"MCE: warning: using only %d banks\n"
And it seems that old kernel will ignore records in banks higher than 6.

Thanks,
H.Seto

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/