RE: [PATCH] x86, mce: use mce_usable_address() for UCNA memory error recovery

From: Luck, Tony
Date: Mon Jan 05 2015 - 13:11:35 EST


> The IA32_MCi_ADDR MSR contains the address of the code or data memory
> location that produced the machine-check error. The IA32_MCi_ADDR
> register is either not implemented or contains no address if the ADDRV
> flag in the IA32_MCi_STATUS register is clear. The address returned is
> an offset into a segment, linear address, physical address, or memory
> address. This depends on the error encountered.
> -- Intel SDM Volume 3B

But the SDM also says:

If both MISCV and IA32_MCG_CAP[24] are set, the IA32_MCi_MISC_MSR
is defined according to Figure 15-8 to support software recovery of
uncorrected errors (see Section 15.6):

So you should only look at the LSB/MODE bits in MCi_MISC on Intel processors
which have MCG_CAP[24] == 1 (handily saved in "mca_cfg.ser" in mce.c).

This was buried in the old code because the only caller of mce_usable_address() was:

if (severity == MCE_AO_SEVERITY && mce_usable_address(&m))

and MCE_AO_SEVERITY can only be set on systems with MCG_CAP[24] == 1.
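
To make the point concrete, here is a rough userspace sketch of a check that
gates the LSB/MODE decode on MCG_CAP[24]. The bit positions follow the SDM and
the macro names mirror the kernel's mce.h, but "struct mce_sketch",
"mcg_cap_has_ser()" and "usable_address_sketch()" are hypothetical names for
illustration, not the patch itself. Returning false when the recovery bit is
clear is an assumption; the right policy for that case is up for discussion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit layout per the SDM; names mirror the kernel's mce.h. */
#define MCG_SER_P             (1ULL << 24)   /* MCG_CAP[24]: software error recovery */
#define MCI_STATUS_MISCV      (1ULL << 59)   /* MCi_MISC register is valid */
#define MCI_STATUS_ADDRV      (1ULL << 58)   /* MCi_ADDR register is valid */
#define MCI_MISC_ADDR_LSB(m)  ((m) & 0x3f)   /* MISC[5:0]: least significant valid address bit */
#define MCI_MISC_ADDR_MODE(m) (((m) >> 6) & 7) /* MISC[8:6]: address mode */
#define MCI_MISC_ADDR_PHYS    2              /* mode value for a physical address */
#define PAGE_SHIFT            12             /* assumes 4 KiB pages */

/* Hypothetical stand-in for the two struct mce fields used here. */
struct mce_sketch {
	uint64_t status;
	uint64_t misc;
};

/* True if the MCG_CAP value advertises software error recovery. */
bool mcg_cap_has_ser(uint64_t mcg_cap)
{
	return (mcg_cap & MCG_SER_P) != 0;
}

/*
 * Sketch only: 'ser' plays the role of mca_cfg.ser.
 * Assumption: without the recovery bit the LSB/MODE fields are not
 * architecturally defined, so the address is treated as unusable.
 */
bool usable_address_sketch(const struct mce_sketch *m, bool ser)
{
	if (!(m->status & MCI_STATUS_MISCV) || !(m->status & MCI_STATUS_ADDRV))
		return false;
	if (!ser)
		return false;
	if (MCI_MISC_ADDR_LSB(m->misc) > PAGE_SHIFT)
		return false;
	if (MCI_MISC_ADDR_MODE(m->misc) != MCI_MISC_ADDR_PHYS)
		return false;
	return true;
}
```

With the old single caller the "ser" test was redundant, since the severity
check already implied it. Any new caller that does not go through the
MCE_AO_SEVERITY path needs the explicit gate.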

-Tony