Re: [PATCH 1/2] platform/x86/amd: Introduce AMD Address Translation Library

From: Borislav Petkov
Date: Tue Aug 08 2023 - 12:54:25 EST


On Tue, Aug 08, 2023 at 11:18:07AM -0400, Yazen Ghannam wrote:
> I think it would be better to avoid dependencies between independent things.

If they really are independent then I guess. Not that it all ends up in
a twisty dependency where you wish you should've merged the two
together. So think about all deps before you design this - it needs to
handle all cases without hackery.

> For example, amd_smn_read() is mostly used in amd64_edac. EDAC was the
> original user of SMN accesses, and all the SMN stuff could have been
> included in EDAC. However, SMN is not specifically for EDAC, so it was added
> to amd_nb.c to be commonly available. Currently, SMN accesses are done in
> other modules. I don't think it would have been a good idea to force other
> modules or subsystems to require EDAC to be used.

What does that have to do with this? SMN access is generic and should be
in amd_nb.c as it is needed by other stuff. EDAC, RAS, whatever are all
users of that thing.

> This is my reasoning for a separate, independent module for the translation.
> EDAC is the first user of this. But there will be future code that can
> leverage this, like CXL, and even the MCE subsystem. And, yes, mce_amd may
> be already loaded, but this isn't a given. A person may want MCE and CXL
> support without wanting to use EDAC.

Is that a real use case or just a hypothetical thing?

> Furthermore, some things using the translation will be built-in, so the
> translation module will need to be built-in.

This sounds weird.

> I agree. And I don't think much of the existing things in EDAC should be
> moved out. But this is new code, so there's an opportunity to have it in a
> more appropriate place.
>
> And, thinking on it more, this could be another example for future "common
> RAS" functionality. Isn't that why the CEC is in drivers/ras?

It is there because it doesn't need EDAC at all. If your translation
doesn't need EDAC and EDAC is going to be only a user of it, then good.

But if you're going to have to need the MCA error decoded by EDAC and
then the error translation done by this thing, then you'd need to
synchronize between the two. I'm not saying it is impossible - it should
be well thought out first though before you go coding.

> It seems like things go into EDAC because it's thought of as the de
> facto RAS location. But why have something in EDAC if it doesn't
> provide EDAC functionality? Other RAS things, like AER, APEI, etc.,
> don't live in EDAC.

AER is part of PCI so we haven't considereed tying it into EDAC. And
there wasn't any desire to do so.

As to APEI, there's ghes_edac...

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette