Re: [boot crash] Re: [tip:x86/mce3] x86, mce: use 64bit machinecheck code on 32bit

From: Ingo Molnar
Date: Mon Aug 17 2009 - 05:18:46 EST



* Hidetoshi Seto <seto.hidetoshi@xxxxxxxxxxxxxx> wrote:

> Ingo Molnar wrote:
> > * Hidetoshi Seto <seto.hidetoshi@xxxxxxxxxxxxxx> wrote:
> >> Could you try booting your laptop with mce=nobootlog?
> >
> > Hm, why should that make any difference? mce=nobootlog only
> > influences whether we pass records into the mcelog buffer but
> > does not affect whether we touch the hardware.
>
> The old mce code doesn't take a bootlog.

I understand what you mean, and I know that we have a number of BIOS
workarounds in the code - but I think some of those workarounds are
wrong and they don't actually solve anything.

The thing is, mce=nobootlog does _not_ keep us from touching
MCE-related hardware registers during bootup.

It only inhibits us from doing an mce_log() call:

	if (!(flags & MCP_DONTLOG) && !mce_dont_log_ce) {
		mce_log(&m);
		add_taint(TAINT_MACHINE_CHECK);
	}

but an mce_log() call itself only passes on the data we already read
from hardware registers, into the MCE ring-buffer (which is a pure
software construct).
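
To illustrate the point, here is a toy userspace model of that flow.
It is only a sketch, not the kernel code: fake_rdmsr(), fake_mce_log(),
poll_banks(), the bank count and the MCP_DONTLOG value are all made up
for the example, and the mce_dont_log_ce condition is left out. The
only thing it is meant to show is that the flag gates the logging call,
while the register reads happen before that check either way.

```c
#include <assert.h>
#include <stdint.h>

/* All names below are hypothetical stand-ins for this sketch, not kernel API. */
#define MCP_DONTLOG   (1 << 0)      /* assumed value; only the gating matters */
#define NUM_BANKS     4             /* assumed bank count */
#define STATUS_VALID  (1ULL << 63)  /* VAL bit of a bank status register */

/* Simulated bank status registers: banks 1 and 3 hold leftover valid records. */
static const uint64_t fake_status[NUM_BANKS] = {
	0, STATUS_VALID | 0x1, 0, STATUS_VALID | 0x2,
};

static int msr_reads;   /* how often the simulated hardware was touched */
static int log_entries; /* how many records reached the software buffer */

/* Stand-in for a status register read; counts every access. */
static uint64_t fake_rdmsr(int bank)
{
	assert(bank >= 0 && bank < NUM_BANKS);
	msr_reads++;
	return fake_status[bank];
}

/* Stand-in for the logging call: a pure software side effect. */
static void fake_mce_log(uint64_t status)
{
	(void)status;
	log_entries++;
}

static void reset_counters(void)
{
	msr_reads = 0;
	log_entries = 0;
}

/* Read every bank unconditionally; the flag only decides whether we log. */
static void poll_banks(int flags)
{
	for (int bank = 0; bank < NUM_BANKS; bank++) {
		uint64_t status = fake_rdmsr(bank); /* happens regardless of flags */

		if (!(status & STATUS_VALID))
			continue;

		if (!(flags & MCP_DONTLOG))
			fake_mce_log(status);
	}
}
```

In this model, poll_banks(0) and poll_banks(MCP_DONTLOG) (with
reset_counters() called in between) both perform NUM_BANKS register
reads; only the number of logged records differs, 2 versus 0.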

> One possibility is: if the BIOS doesn't clear the status in the
> banks, the new mce code will try to log such junk.
>
> If the junk is total junk but can be decoded as a valid log with
> the MISCV or ADDRV bit set, and if the cpu tries to access a register
> which is not implemented (e.g. IA32_MCi_MISC/ADDR), then such an access
> might cause a general protection exception. (ref. ASDM 3A 15.3.2.3)
>
> I'm just guessing...

My point is that mce=nobootlog will only affect whether we call
mce_log(). It does not keep us from touching all the MSRs that
relate to MCEs.

mce=off does that, and the box boots up fine with that specified
(and as expected).

Ingo