[PATCH 2/2] x86/MCE: Add command line option to extend MCE Records pool

From: Naik, Avadhut
Date: Thu Feb 15 2024 - 15:18:24 EST


On 2/12/2024 11:54, Borislav Petkov wrote:
> On Mon, Feb 12, 2024 at 05:29:31PM +0000, Luck, Tony wrote:
>> Walking the structures already allocated from the genpool in the #MC
>> handler may be possible, but what is the criteria for "duplicates"?
> for each i in pool:
> memcmp(mce[i], new_mce, sizeof(struct mce));
> It'll probably need to mask out fields like ->time etc.
Also, some fields like cpuvendor, ppin and microcode will remain the
same for all MCEs received on a system. Can we avoid comparing them?

We already seem to have a function mce_cmp(), introduced back in
2016, which accomplishes something similar for fatal errors. But
it only checks for bank, status, addr and misc registers. Should
we just modify this function to compare MCEs? It should work for
fatal errors too.
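
To make the idea concrete, below is a rough userspace sketch of a
field-wise compare in the spirit of mce_cmp(). struct mce_rec and
mce_rec_is_dup() are made-up names for illustration, and the struct
carries only a handful of fields; the real struct mce has more, and
which of them should take part in the comparison is exactly the open
question here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct mce (illustration only). */
struct mce_rec {
	uint64_t status;
	uint64_t addr;
	uint64_t misc;
	uint64_t time;       /* differs per event, so never compared */
	uint64_t ppin;       /* assumed constant across records, skipped */
	uint32_t microcode;  /* assumed constant across records, skipped */
	uint8_t  cpuvendor;  /* assumed constant across records, skipped */
	uint8_t  bank;
};

/*
 * Returns true when both records carry the same bank, status, addr and
 * misc values. All other fields are ignored on purpose.
 */
static bool mce_rec_is_dup(const struct mce_rec *a, const struct mce_rec *b)
{
	return a->bank == b->bank &&
	       a->status == b->status &&
	       a->addr == b->addr &&
	       a->misc == b->misc;
}
```

Comparing named fields instead of calling memcmp() over the whole
struct avoids having to mask out ->time and friends, and it does not
depend on padding bytes being initialized.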

For my own understanding:
The general motto for #MC or interrupt contexts is *keep it short
and sweet*. Though memcmp() is fairly optimized, we would still be
running a *for* loop in MC context. In case successive back-to-back
MCEs are being received and if the pool already has a fair number of
records, wouldn't this comparison significantly extend our stay
in #MC context?
I discussed this with Yazen and, IIUC, nested MCEs are not supported
on x86. Please correct me if I am wrong about this.
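
The cost I am worried about can be sketched as a plain linear scan.
This is only an illustration under simplifying assumptions: the records
sit in a flat array here, which is not how the genpool stores them, and
struct err_rec and pool_find_dup() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal record: only the fields that get compared. */
struct err_rec {
	uint64_t status;
	uint64_t addr;
	uint64_t misc;
	uint8_t  bank;
};

/*
 * Linear scan over the stored records. Returns the index of the first
 * record matching new_rec, or -1 if there is none. The worst case (no
 * duplicate present) touches all n records.
 */
static long pool_find_dup(const struct err_rec *pool, size_t n,
			  const struct err_rec *new_rec)
{
	for (size_t i = 0; i < n; i++) {
		if (pool[i].bank == new_rec->bank &&
		    pool[i].status == new_rec->status &&
		    pool[i].addr == new_rec->addr &&
		    pool[i].misc == new_rec->misc)
			return (long)i;
	}
	return -1;
}
```

The scan stops at the first match, but a new, non-duplicate record
always pays for the full walk, so the time spent grows with the number
of records already in the pool.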

Avadhut Naik