RE: [PATCH 2/2] x86/MCE: Add command line option to extend MCE Records pool

From: Luck, Tony
Date: Mon Feb 12 2024 - 12:30:11 EST


> And here's the simplest scheme: you don't extend the buffer. On
> overflow, you say "Buffer full, here's the MCE" and you dump the error
> log into dmesg. Problem solved.
>
> A slicker deduplication scheme would be even better, tho. Maybe struct
> mce.count which gets incremented instead of adding the error record to
> the buffer again. And so on...

Walking the structures already allocated from the genpool in the #MC
handler may be possible, but what are the criteria for "duplicates"?
Do we avoid entering duplicates into the pool altogether? Or, when the
pool is full, overwrite a duplicate?
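
To make the "duplicates" question concrete, here is a minimal userspace sketch of the first option (never enter a duplicate, bump a counter instead). Everything in it is an assumption for illustration: `struct mce_rec` is a hypothetical stand-in rather than the kernel's `struct mce`, the "same bank, status and address" test is only one possible criterion, and the `count` field is the one suggested in the quoted text, not an existing member. It uses a plain array, not the genpool/llist code, and ignores the lock-less constraints of #MC context.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an MCE record; not the kernel's struct mce. */
struct mce_rec {
	uint8_t      bank;    /* reporting bank */
	uint64_t     status;  /* bank status value */
	uint64_t     addr;    /* reported address */
	unsigned int count;   /* proposed: number of times this error was seen */
};

/* One possible (assumed) duplicate criterion: same bank, status and address. */
static int mce_rec_same(const struct mce_rec *a, const struct mce_rec *b)
{
	return a->bank == b->bank && a->status == b->status && a->addr == b->addr;
}

/*
 * Walk the records already stored. If a duplicate exists, increment its
 * count instead of storing the new record.
 * Returns 0 if merged, 1 if stored as a new record, -1 if the pool is full.
 */
static int mce_rec_add(struct mce_rec *pool, size_t *used, size_t cap,
		       const struct mce_rec *m)
{
	for (size_t i = 0; i < *used; i++) {
		if (mce_rec_same(&pool[i], m)) {
			pool[i].count++;
			return 0;
		}
	}

	if (*used == cap)
		return -1;

	pool[*used] = *m;
	pool[*used].count = 1;
	(*used)++;
	return 1;
}
```

The second option (overwrite a duplicate only once the pool is full) would need a different policy in the `*used == cap` branch. The sketch deliberately leaves that open.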

How about compile-time allocation of extra space? The algorithm below is
for illustrative purposes only. It may need some more thought about how
to scale up.

-Tony

[Diff pasted into Outlook, chances that it will automatically apply = 0%]

diff --git a/arch/x86/kernel/cpu/mce/genpool.c b/arch/x86/kernel/cpu/mce/genpool.c
index fbe8b61c3413..0fc2925c0839 100644
--- a/arch/x86/kernel/cpu/mce/genpool.c
+++ b/arch/x86/kernel/cpu/mce/genpool.c
@@ -16,10 +16,15 @@
  * used to save error information organized in a lock-less list.
  *
  * This memory pool is only to be used to save MCE records in MCE context.
- * MCE events are rare, so a fixed size memory pool should be enough. Use
- * 2 pages to save MCE events for now (~80 MCE records at most).
+ * MCE events are rare, so a fixed size memory pool should be enough.
+ * Low CPU count systems allocate 2 pages (enough for ~64 "struct mce"
+ * records). Large systems scale up the allocation based on CPU count.
  */
+#if CONFIG_NR_CPUS < 128
 #define MCE_POOLSZ (2 * PAGE_SIZE)
+#else
+#define MCE_POOLSZ (CONFIG_NR_CPUS / 64 * PAGE_SIZE)
+#endif

 static struct gen_pool *mce_evt_pool;
 static LLIST_HEAD(mce_event_llist);
[agluck@agluck-desk3 mywork]$ vi arch/x86/kernel/cpu/mce/genpool.c
[agluck@agluck-desk3 mywork]$ git diff
diff --git a/arch/x86/kernel/cpu/mce/genpool.c b/arch/x86/kernel/cpu/mce/genpool.c
index fbe8b61c3413..47bf677578ca 100644
--- a/arch/x86/kernel/cpu/mce/genpool.c
+++ b/arch/x86/kernel/cpu/mce/genpool.c
@@ -16,10 +16,15 @@
  * used to save error information organized in a lock-less list.
  *
  * This memory pool is only to be used to save MCE records in MCE context.
- * MCE events are rare, so a fixed size memory pool should be enough. Use
- * 2 pages to save MCE events for now (~80 MCE records at most).
+ * MCE events are rare, but scale up with CPU count. Low CPU count
+ * systems allocate 2 pages (enough for ~64 "struct mce" records). Large
+ * systems scale up the allocation based on CPU count.
  */
+#if CONFIG_NR_CPUS < 128
 #define MCE_POOLSZ (2 * PAGE_SIZE)
+#else
+#define MCE_POOLSZ (CONFIG_NR_CPUS / 64 * PAGE_SIZE)
+#endif

 static struct gen_pool *mce_evt_pool;
 static LLIST_HEAD(mce_event_llist);
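
The sizing rule in the diff above can be checked with a small userspace sketch. It mirrors the `#if CONFIG_NR_CPUS < 128` logic as a function so the resulting pool sizes are easy to see. Two values are assumptions, not taken from the patch: a 4 KiB page size, and a 128-byte record size, which is what the comment's "2 pages, ~64 records" figure implies under that page size. The names `mce_poolsz()` and `mce_pool_records()` are hypothetical helpers.

```c
#include <stddef.h>

#define PAGE_SIZE_BYTES  4096UL /* assumption: 4 KiB pages */
#define MCE_RECORD_BYTES 128UL  /* assumption: implied by "2 pages ~ 64 records" */

/* Pool size in bytes for a given CPU count, following the rule in the diff. */
static unsigned long mce_poolsz(unsigned int nr_cpus)
{
	if (nr_cpus < 128)
		return 2 * PAGE_SIZE_BYTES;

	/* Integer division: one page per 64 CPUs, rounded down. */
	return nr_cpus / 64 * PAGE_SIZE_BYTES;
}

/* Approximate number of records that fit in the pool. */
static unsigned long mce_pool_records(unsigned int nr_cpus)
{
	return mce_poolsz(nr_cpus) / MCE_RECORD_BYTES;
}
```

Under these assumptions, 64 and 128 CPUs both get 2 pages (about 64 records), 512 CPUs get 8 pages (about 256 records), and 8192 CPUs get 128 pages (about 4096 records). Because the division rounds down, a count such as 191 still yields 2 pages, which is one of the details that may need more thought when scaling up.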