Re: [PATCH v4 2/2] x86/MCE/AMD: Don't report L1 BTB MCA errors on some Family 17h models

From: Thomas Gleixner
Date: Wed Mar 27 2019 - 15:20:05 EST

On Mon, 25 Mar 2019, Ghannam, Yazen wrote:

> From: Yazen Ghannam <yazen.ghannam@xxxxxxx>
>
> AMD Family 17h Models 10h-2Fh may report a high number of L1 BTB MCA
> errors under certain conditions. The errors are benign and can safely be
> ignored. However, the high error rate may cause the MCA threshold
> counter to overflow causing a high rate of thresholding interrupts. In
> addition, users may see the errors reported through the AMD MCE decoder
> module, even with the interrupt disabled, due to MCA polling.
>
> This error is reported through the Instruction Fetch bank.
>
> Clear the "Counter Present" bit in the Instruction Fetch bank's
> MCA_MISC0 register. This will prevent enabling MCA thresholding on this
> bank which will prevent the high interrupt rate due to this error.
>
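
For reference, a minimal sketch of what clearing "Counter Present" amounts
to. The changelog does not give the bit position, so bit 62 is an assumption
here, and misc0_clear_cntp() is a hypothetical helper operating on a plain
value rather than on the MSR itself:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: "Counter Present" (CntP) is bit 62 of MCA_MISC0. */
#define MISC0_CNTP_BIT	62

/* Hypothetical helper: return the register value with CntP cleared. */
uint64_t misc0_clear_cntp(uint64_t misc0)
{
	return misc0 & ~(UINT64_C(1) << MISC0_CNTP_BIT);
}

/* Self-check on a made-up register value: only bit 62 changes. */
void misc0_self_check(void)
{
	uint64_t before = UINT64_C(0xC000000000000001);
	uint64_t after  = UINT64_C(0x8000000000000001);

	assert(misc0_clear_cntp(before) == after);
	/* Clearing an already-clear bit leaves the value untouched. */
	assert(misc0_clear_cntp(after) == after);
}
```

With the bit cleared, the thresholding setup would see no counter on that
bank and skip it, which is the effect the changelog describes.
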
> Define an AMD-specific function to filter these errors from the MCE
> event pool.
>
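
The filter described above could look roughly like the sketch below. The
struct layouts, the bank enum and the extended error code value are all
placeholders for illustration, not taken from the patch; only the family
and model range come from the changelog:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the CPU and the error record. */
enum bank_type { BANK_IF, BANK_OTHER };

struct cpu_id  { uint8_t family; uint8_t model; };
struct err_rec { enum bank_type bank; uint8_t xec; };

/* Assumption: placeholder extended error code for the L1 BTB error. */
#define XEC_L1_BTB	10

/* Return true if the record should be dropped before reaching the event pool. */
bool filter_l1_btb(const struct cpu_id *c, const struct err_rec *e)
{
	/* Only Family 17h Models 10h-2Fh are affected. */
	if (c->family != 0x17)
		return false;
	if (c->model < 0x10 || c->model > 0x2F)
		return false;

	/* The error is reported through the Instruction Fetch bank. */
	return e->bank == BANK_IF && e->xec == XEC_L1_BTB;
}

/* Self-check: an affected model is filtered, an unaffected one is not. */
void filter_self_check(void)
{
	struct cpu_id affected = { 0x17, 0x18 };
	struct cpu_id other    = { 0x17, 0x01 };
	struct err_rec e       = { BANK_IF, XEC_L1_BTB };

	assert(filter_l1_btb(&affected, &e));
	assert(!filter_l1_btb(&other, &e));
}
```

Everything that does not match all three conditions is passed through
unchanged, so errors from other banks on the same models are still logged.
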
> Rename filter function in EDAC/mce_amd to avoid a naming conflict.
>
> Cc: <stable@xxxxxxxxxxxxxxx> # 5.0.x: c95b323dcd35: x86/MCE/AMD: Turn off MC4_MISC thresholding on all family 0x15 models

What is this supposed to tell us?

> Cc: <stable@xxxxxxxxxxxxxxx> # 5.0.x: 30aa3d26edb0: x86/MCE/AMD: Carve out the MC4_MISC thresholding quirk
> Cc: <stable@xxxxxxxxxxxxxxx> # 5.0.x: 9308fd407455: x86/MCE: Group AMD function prototypes in <asm/mce.h>
> Cc: <stable@xxxxxxxxxxxxxxx> # 5.0.x

Confused.

Thanks,

tglx