Re: [PATCH 00/25] AMD MCA Address Translation Updates

From: Borislav Petkov
Date: Wed May 19 2021 - 10:32:41 EST


On Tue, May 18, 2021 at 11:52:07PM -0400, Yazen Ghannam wrote:
> I think this is a good idea. The only hang-up is that we should be using
> the output of this function, i.e. the system physical address, when
> handling memory errors in the MCE notifier blocks. But I have an idea
> where we can handle this. I can send that as a follow up series, if
> that's okay.

Yeah, so frankly I'm not happy with all this clumsy plugging we do
with notifiers and then amd_register_ecc_decoder() which is called in
mce_amd.c to be used in amd64_edac.c which finally logs it.

What I'd like to see is mce_amd.c still decoding all kinds of errors and
amd64_edac additionally processing only the DRAM errors.
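
To make that split concrete, here is a rough userspace sketch, not kernel
code. Every name in it (err_rec, decode_error, register_dram_handler,
edac_dram_handler) is hypothetical and only stands in for the roles of
mce_amd.c and amd64_edac: the generic decoder always runs, and the DRAM
handler is an optional extra step that exists only once it has been
registered.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical error record; this is NOT the kernel's struct mce. */
struct err_rec {
	bool is_dram;            /* true for DRAM ECC errors */
	unsigned long long addr; /* reported error address */
};

/* Optional DRAM handler, set only while the "EDAC" part is loaded. */
typedef void (*dram_handler_t)(const struct err_rec *);
static dram_handler_t dram_handler;

static int decoded_count; /* errors seen by the generic decoder */
static int dram_count;    /* errors further processed by the DRAM handler */

void register_dram_handler(dram_handler_t fn) { dram_handler = fn; }
void unregister_dram_handler(void) { dram_handler = NULL; }

/* Stands in for amd64_edac: only ever sees DRAM errors. */
void edac_dram_handler(const struct err_rec *e)
{
	dram_count++;
	printf("  DRAM error at 0x%llx: DIMM mapping would happen here\n",
	       e->addr);
}

/* Stands in for mce_amd.c: decodes every error, DRAM or not. */
void decode_error(const struct err_rec *e)
{
	decoded_count++;
	printf("decoded %s error\n", e->is_dram ? "DRAM" : "non-DRAM");

	/* Without a registered handler, DRAM errors are still decoded,
	 * they just don't get the additional processing. */
	if (e->is_dram && dram_handler)
		dram_handler(e);
}
```

Calling decode_error() before register_dram_handler(edac_dram_handler) still
decodes the error, it just skips the DRAM-specific step.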

> One other issue is what if a user doesn't want to use amd64_edac_mod?

Then she/he doesn't get DRAM errors mapped to a DIMM - simple.

> This is more of a user preference and/or configuration issue. Maybe the
> module loads, but an uninterested user can tell EDAC to not log errors,
> etc.? Or should the translation code live in its own module?

No need. Translation is part of EDAC so if you don't load it, you don't
get the functionality.

> So for version 2, I have 1) Add a glossary of terms, and 2) Move
> everything to EDAC. Any other comments?

None at the moment - I'll do a deeper review with v2.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette