[PATCH v7 0/3] Handle corrected machine check interrupt storms

From: Tony Luck
Date: Tue Jul 18 2023 - 17:09:04 EST


Linux CMCI storm mitigation is a big hammer that just disables the CMCI
interrupt globally and switches to polling all banks.

There are two problems with this:
1) It really is a big hammer. It means that errors reported in other
banks from different functional units are all subject to the same
polling delay before being processed.
2) Intel systems signal some uncorrected errors using CMCI (e.g.
memory controller patrol scrub on Icelake Xeon and newer). Delaying
the processing of these error reports negates some of the benefit of
the patrol scrubber, which provides early notice of errors before they
are consumed and cause a machine check.

This series throws away the old storm implementation and replaces it
with one that keeps track of the weather on each separate machine check
bank. When a storm is detected on a bank, Intel systems mitigate it by
setting a very high threshold for corrected errors to signal CMCI for
that bank. This threshold does not affect signaling CMCI for
uncorrected errors.
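
To illustrate the per-bank idea, here is a minimal user-space sketch. It
is not the code from the patches: the bank count, both threshold values,
struct bank_state and every function name are hypothetical. It only shows
that raising the corrected-error threshold on one bank leaves other banks,
and uncorrected errors, signaling as before.

```c
#include <stdbool.h>

#define NR_BANKS         4       /* hypothetical bank count */
#define NORMAL_THRESHOLD 1       /* hypothetical: signal on every corrected error */
#define STORM_THRESHOLD  0x7ff0  /* hypothetical "very high" threshold */

/* Hypothetical per-bank state; not the layout used in the patches. */
struct bank_state {
	bool in_storm;
	unsigned int ce_threshold;  /* corrected errors needed to signal */
};

static struct bank_state banks[NR_BANKS];

static void banks_init(void)
{
	for (int i = 0; i < NR_BANKS; i++) {
		banks[i].in_storm = false;
		banks[i].ce_threshold = NORMAL_THRESHOLD;
	}
}

/* Mitigate a storm on one bank only; other banks are untouched. */
static void bank_storm_begin(int bank)
{
	banks[bank].in_storm = true;
	banks[bank].ce_threshold = STORM_THRESHOLD;
}

/* Restore normal signaling once the storm on this bank is over. */
static void bank_storm_end(int bank)
{
	banks[bank].in_storm = false;
	banks[bank].ce_threshold = NORMAL_THRESHOLD;
}

/* Uncorrected errors always signal; corrected errors obey the threshold. */
static bool signals_cmci(int bank, bool uncorrected, unsigned int ce_count)
{
	if (uncorrected)
		return true;
	return ce_count >= banks[bank].ce_threshold;
}
```

With bank 2 in storm mode, a single corrected error on bank 2 no longer
signals, while bank 0 and any uncorrected error still do.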

Changes since last version:

0) Rebased to v6.5-rc2
1) Yazen & Boris - dropped AMD patch pending integration of AMD
machine check bank scanning with the core machine_check_poll()
function.
2) Boris - rename track_cmci_storm() as track_storm() in prep for
the day when AMD joins in - they don't call the interrupt "CMCI".
This function is now "static" and local to core.c.
3) Boris - Define new "struct storm_bank" for all the storm tracking
arrays.
4) Move the storm_poll_mode per-CPU tracker into the storm_desc
structure.
5) Define STORM_END_POLL_THRESHOLD as "29" instead of "30" with comment
that it is used as high end of a bitmask that counts from zero. Drop
the " - 1" where it is used.
6) Don't use kernel-doc format comments in mce/internal.h.
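
For item 5, a small sketch of why "29" is the right value when the
constant is the high end of a bitmask that counts from zero: bits 0..29
cover 30 positions. The genmask_ull() helper is a local stand-in for the
kernel's GENMASK_ULL() macro, and the meaning of the history word (bit 0
is the most recent poll, a set bit means an error was seen) is an
assumption made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define STORM_END_POLL_THRESHOLD 29  /* highest bit index, counting from zero */

/* Local stand-in for the kernel's GENMASK_ULL(high, low). */
static uint64_t genmask_ull(unsigned int high, unsigned int low)
{
	return (~UINT64_C(0) >> (63 - high)) & (~UINT64_C(0) << low);
}

/*
 * Assumed history layout: bit 0 is the most recent poll and a set bit
 * means that poll found an error. The storm is treated as over when
 * none of the 30 most recent polls (bits 0..29) found one.
 */
static bool storm_has_subsided(uint64_t history)
{
	return (history & genmask_ull(STORM_END_POLL_THRESHOLD, 0)) == 0;
}
```

An error recorded at bit 29 still keeps the storm alive; one recorded at
bit 30 has aged out of the window.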

Suggested change NOT taken:
> + * If this is the first bank on this CPU to enter storm mode
> + * start polling
> + */
> + if (++storm->stormy_bank_count == 1)

if (++storm->stormy_bank_count)

> + mce_timer_kick(true);

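As the comment above this code says, we only want to "kick" the timer
when the first bank on a core goes into storm mode. The suggested test
is true for every non-zero count, so it would kick the timer again for
each additional bank. If another bank also goes into storm mode while
the first storm is active, there is no need to "start polling" because
that is already happening for the first storm.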
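
The difference between the two tests can be shown with a stand-alone
sketch. Only the stormy_bank_count name comes from the quoted code;
struct storm_sketch, the kicks counter and the function names are
hypothetical, and timer_kick_stub() merely counts calls in place of
mce_timer_kick(true).

```c
/* Hypothetical stand-in for the per-CPU storm descriptor. */
struct storm_sketch {
	int stormy_bank_count;  /* banks currently in storm mode */
	int kicks;              /* how many times the timer was kicked */
};

/* Stub that counts calls; stands in for mce_timer_kick(true). */
static void timer_kick_stub(struct storm_sketch *s)
{
	s->kicks++;
}

/* Test as posted: kick only on the 0 -> 1 transition. */
static void enter_storm_first_only(struct storm_sketch *s)
{
	if (++s->stormy_bank_count == 1)
		timer_kick_stub(s);
}

/* Suggested variant: true for any non-zero count, so kicks every time. */
static void enter_storm_nonzero(struct storm_sketch *s)
{
	if (++s->stormy_bank_count)
		timer_kick_stub(s);
}
```

With two banks entering storm mode back to back, the first version kicks
the timer once and the second kicks it twice.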

Tony Luck (3):
  x86/mce: Remove old CMCI storm mitigation code
  x86/mce: Add per-bank CMCI storm mitigation
  x86/mce: Handle Intel threshold interrupt storms

 arch/x86/kernel/cpu/mce/internal.h |  49 ++++-
 arch/x86/kernel/cpu/mce/core.c     | 131 +++++++++---
 arch/x86/kernel/cpu/mce/intel.c    | 333 +++++++++++++----------------
 3 files changed, 290 insertions(+), 223 deletions(-)


base-commit: fdf0eaf11452d72945af31804e2a1048ee1b574c
--
2.40.1