Re: [PATCH v2] x86/mce: Defer processing early errors until mcheck_late_init()

From: Borislav Petkov
Date: Mon Aug 23 2021 - 16:51:02 EST


On Mon, Aug 23, 2021 at 01:41:22PM -0700, Luck, Tony wrote:
> arch/x86/kernel/cpu/mce/core.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)

I actually had a different idea in mind, considering the fact that some
machinery to only log the early MCEs is already there. And this fits
more naturally in the flow and doesn't need a bool switch.

Hmmm?

---
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 0607ec4f5091..9b13cca74f65 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -265,6 +265,7 @@ enum mcp_flags {
 	MCP_TIMESTAMP		= BIT(0),	/* log time stamp */
 	MCP_UC			= BIT(1),	/* log uncorrected errors */
 	MCP_DONTLOG		= BIT(2),	/* only clear, don't log */
+	MCP_LOG_ONLY		= BIT(3),	/* log only */
 };
 bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b);

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 22791aadc085..bb691503c2e4 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -817,7 +817,10 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 		if (mca_cfg.dont_log_ce && !mce_usable_address(&m))
 			goto clear_it;

-		mce_log(&m);
+		if (flags & MCP_LOG_ONLY)
+			mce_gen_pool_add(&m);
+		else
+			mce_log(&m);

 clear_it:
 		/*
@@ -1639,10 +1642,12 @@ static void __mcheck_cpu_init_generic(void)
 		m_fl = MCP_DONTLOG;

 	/*
-	 * Log the machine checks left over from the previous reset.
+	 * Log the machine checks left over from the previous reset. Log them
+	 * only, do not start processing them. That will happen in mcheck_late_init()
+	 * when all consumers have been registered on the notifier chain.
 	 */
 	bitmap_fill(all_banks, MAX_NR_BANKS);
-	machine_check_poll(MCP_UC | MCP_LOG_ONLY | m_fl, &all_banks);
+	machine_check_poll(MCP_UC | MCP_LOG_ONLY | m_fl, &all_banks);

 	cr4_set_bits(X86_CR4_MCE);

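To make the intended flow easier to poke at, here is a rough user-space
model of it. This is only a sketch, not kernel code: every toy_* name is
made up for illustration, the pool is a plain array, and "processing" just
counts whether any consumer was registered at the time. It assumes, as I
read mce_log(), that logging both queues the record and kicks off
processing, whereas mce_gen_pool_add() only queues it.

```c
/* Toy user-space model; every toy_* name is hypothetical, not kernel API. */
#include <assert.h>
#include <stddef.h>

#define BIT(n) (1u << (n))

/* Flag values mirror the enum in the patch. */
enum mcp_flags {
	MCP_TIMESTAMP	= BIT(0),	/* log time stamp */
	MCP_UC		= BIT(1),	/* log uncorrected errors */
	MCP_DONTLOG	= BIT(2),	/* only clear, don't log */
	MCP_LOG_ONLY	= BIT(3),	/* log only */
};

struct toy_mce {
	int bank;
};

#define TOY_POOL_SIZE 8

static struct toy_mce toy_pool[TOY_POOL_SIZE];
static size_t toy_pool_len;
static int toy_consumers;	/* consumers registered on the chain */
static int toy_delivered;	/* records processed with a consumer present */
static int toy_missed;		/* records processed with nobody listening */

/* Stand-in for mce_gen_pool_add(): queue the record, 0 on success. */
static int toy_pool_add(const struct toy_mce *m)
{
	assert(m != NULL);
	if (toy_pool_len == TOY_POOL_SIZE)
		return -1;
	toy_pool[toy_pool_len++] = *m;
	return 0;
}

/* Stand-in for draining the pool through the notifier chain. */
static void toy_process_pool(void)
{
	for (size_t i = 0; i < toy_pool_len; i++) {
		if (toy_consumers > 0)
			toy_delivered++;
		else
			toy_missed++;
	}
	toy_pool_len = 0;
}

/* Stand-in for mce_log(): queue and start processing right away. */
static void toy_log(const struct toy_mce *m)
{
	if (!toy_pool_add(m))
		toy_process_pool();
}

/* Models the branch the patch adds to machine_check_poll(). */
static void toy_poll(unsigned int flags, const struct toy_mce *m)
{
	if (flags & MCP_DONTLOG)
		return;
	if (flags & MCP_LOG_ONLY)
		toy_pool_add(m);
	else
		toy_log(m);
}

/* Stand-in for mcheck_late_init(): consumers exist now, so process. */
static void toy_late_init(int consumers)
{
	toy_consumers = consumers;
	toy_process_pool();
}

/* Return the model to its boot-time state. */
static void toy_reset(void)
{
	toy_pool_len = 0;
	toy_consumers = 0;
	toy_delivered = 0;
	toy_missed = 0;
}
```

Calling toy_poll(MCP_UC | MCP_LOG_ONLY, ...) before toy_late_init() leaves
the record sitting in the pool until consumers are registered. The same
call without MCP_LOG_ONLY processes it immediately, with nobody on the
chain to see it.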

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette