[PATCH v2 0/2] AMD Address Translation Library

From: Yazen Ghannam
Date: Thu Oct 05 2023 - 13:44:19 EST


Hi all,

This set adds a new library to do AMD-specific address translation. The
first use case is translating a Unified Memory Controller (UMC)
"Normalized" address to a system physical address. Another use case will
be a similar translation for certain CXL configurations. EDAC is the
only user at the moment, but the library can be used by the MCA and CXL
subsystems too.
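
As a rough illustration of what "denormalizing" means, here is a toy
userspace sketch. It is not taken from the patch: toy_denormalize() and
all of its parameters are hypothetical, and it assumes the simplest
possible layout (a power-of-two number of channels interleaved at one
bit position, no hashing). A normalized address has the channel-select
bits squeezed out; rebuilding the system address re-inserts them and
adds the base of the mapped region.

```c
#include <stdint.h>

/*
 * Toy model only (hypothetical helper, not the library's algorithm):
 * 2^chan_bits channels are interleaved starting at bit intlv_bit.
 */
static uint64_t toy_denormalize(uint64_t norm_addr, unsigned int chan_id,
				unsigned int intlv_bit,
				unsigned int chan_bits, uint64_t base)
{
	uint64_t low_mask = (1ULL << intlv_bit) - 1;
	uint64_t low = norm_addr & low_mask;	/* bits below the gap */
	uint64_t high = norm_addr & ~low_mask;	/* bits at/above the gap */

	/* Open a gap of chan_bits at intlv_bit and drop the channel ID in. */
	return base + ((high << chan_bits) |
		       ((uint64_t)chan_id << intlv_bit) | low);
}
```

With two channels interleaved at bit 8, normalized address 0x1234 on
channel 1 maps back to 0x2534 in this toy model. The real translation
also has to handle hashing, non-power-of-two modes, and so on.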

Since this code is very much implementation-specific, I thought it'd be
appropriate to have it as a "library module". Having the option to
build as a module helps with development, but this will likely be
'built-in' to use for MCA and CXL in production.

I had planned to include example code stubs for MCA and CXL with this
second submission. But I got occupied with other things, and I don't
want to hold off on the current set. The MCA subsystem integration is
starting development, so working patches should be ready soon.

Patch 1 adds the new code. This includes support for all current AMD
Zen-based systems with a couple of exceptions noted in the commit
message.

The code is based on AMD reference code. Much of this is arbitrary bit
arithmetic. But I tried my best to make clarifying comments and to
restructure the code to be easier to follow.

Also, I purposefully avoided "over-optimizing" for the same reason, and
also to leverage compile-time checks for bitfields, etc. For example,
there are many uses of FIELD_GET(), which requires its mask argument to
be a compile-time constant.
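
To show why constant masks matter, here is a small userspace sketch. It
does not use the kernel headers: TOY_GENMASK() and TOY_FIELD_GET() are
simplified stand-ins for GENMASK() and FIELD_GET(), and the register
layout is made up for illustration. Because each mask is a constant, a
malformed one can be rejected at build time instead of silently
extracting the wrong bits.

```c
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel macros. */
#define TOY_GENMASK(h, l) \
	(((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

/* Dividing by the lowest set bit of the mask shifts the field down. */
#define TOY_FIELD_GET(mask, reg) \
	(((reg) & (mask)) / ((mask) & -(mask)))

/* Build-time check: the mask must be non-empty and contiguous. */
#define TOY_CHECK_MASK(mask) \
	_Static_assert((mask) != 0 && \
		       ((((mask) + ((mask) & -(mask))) & (mask)) == 0), \
		       "mask must be a non-empty, contiguous constant")

/* Hypothetical register layout, for illustration only. */
#define TOY_INTLV_NUM_CHAN	TOY_GENMASK(7, 4)
#define TOY_INTLV_ADDR_SEL	TOY_GENMASK(10, 8)

TOY_CHECK_MASK(TOY_INTLV_NUM_CHAN);
TOY_CHECK_MASK(TOY_INTLV_ADDR_SEL);

static unsigned int toy_num_chan(uint32_t reg)
{
	return (unsigned int)TOY_FIELD_GET(TOY_INTLV_NUM_CHAN, reg);
}

static unsigned int toy_addr_sel(uint32_t reg)
{
	return (unsigned int)TOY_FIELD_GET(TOY_INTLV_ADDR_SEL, reg);
}
```

Passing a mask computed at run time would defeat TOY_CHECK_MASK(), which
is the trade-off described above: one accessor per named field, in
exchange for build-time validation.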

The reference code underwent a major refactor. Therefore, this current
set is a fresh start. I figure it's best to match the latest reference
rather than submit another revision based on old code that will need to
be refactored anyway.

Old patch set (before refactor):
https://lore.kernel.org/r/20220127204115.384161-1-yazen.ghannam@xxxxxxx

Many code paths are shared between the various interleaving modes and
Data Fabric revisions, and they aren't easily decoupled. So run-time
checks are used for code flow rather than function pointers, etc.
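
A minimal sketch of that structure, with hypothetical names and made-up
bit positions (none of this is taken from the patch): the common steps
live in one function, and only the revision- or mode-specific details
branch on run-time checks.

```c
#include <stdint.h>

/* Hypothetical enums and context; the real ones live in the patch. */
enum toy_df_rev { TOY_DF2, TOY_DF3, TOY_DF4 };
enum toy_intlv_mode { TOY_NONE, TOY_NOHASH_2CHAN, TOY_HASH_2CHAN };

struct toy_ctx {
	enum toy_df_rev rev;
	enum toy_intlv_mode mode;
	uint64_t addr;
};

/* Revision-specific detail; the bit positions are invented. */
static unsigned int toy_intlv_bit(const struct toy_ctx *ctx)
{
	return ctx->rev == TOY_DF2 ? 8 : 12;
}

static uint64_t toy_translate(const struct toy_ctx *ctx)
{
	unsigned int bit = toy_intlv_bit(ctx);
	uint64_t low_mask = (1ULL << bit) - 1;
	uint64_t addr;

	if (ctx->mode == TOY_NONE)
		return ctx->addr;

	/* Shared by every interleaved mode: open a one-bit gap. */
	addr = ((ctx->addr & ~low_mask) << 1) | (ctx->addr & low_mask);

	/* Mode-specific tail, selected by a plain run-time check. */
	if (ctx->mode == TOY_HASH_2CHAN)
		addr ^= ((addr >> 16) & 1) << bit;

	return addr;
}
```

With function pointers, each revision and mode combination would need
its own callback even though most of the body is identical, so the
shared steps would either be duplicated or split into many small
helpers.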

All the code is added within a single patch. Mostly, this was done to
get the "whole picture" of how things fit together. But I can break this
up into separate patches for each Data Fabric revision, if needed. I
also want to avoid taking the old code and incrementally refactoring.
Since the old code no longer matches the reference, I think it's simpler
to just add the new and delete the old.

Patch 2 removes the old code and switches the AMD64 EDAC module to use
the new code.

Link:
https://lore.kernel.org/r/20230802185504.606855-1-yazen.ghannam@xxxxxxx

v1->v2:
1) Move code to drivers/ras.
2) Add "reachable" check to header file.

Thanks,
Yazen

Yazen Ghannam (2):
  RAS: Introduce AMD Address Translation Library
  EDAC/amd64: Use new AMD Address Translation Library

MAINTAINERS | 7 +
drivers/edac/Kconfig | 1 +
drivers/edac/amd64_edac.c | 278 +------------
drivers/ras/Kconfig | 1 +
drivers/ras/Makefile | 1 +
drivers/ras/amd/atl/Kconfig | 19 +
drivers/ras/amd/atl/Makefile | 18 +
drivers/ras/amd/atl/access.c | 107 +++++
drivers/ras/amd/atl/core.c | 212 ++++++++++
drivers/ras/amd/atl/dehash.c | 459 +++++++++++++++++++++
drivers/ras/amd/atl/denormalize.c | 644 +++++++++++++++++++++++++++++
drivers/ras/amd/atl/internal.h | 307 ++++++++++++++
drivers/ras/amd/atl/map.c | 659 ++++++++++++++++++++++++++++++
drivers/ras/amd/atl/reg_fields.h | 603 +++++++++++++++++++++++++++
drivers/ras/amd/atl/system.c | 282 +++++++++++++
drivers/ras/amd/atl/umc.c | 53 +++
include/linux/amd-atl.h | 28 ++
17 files changed, 3403 insertions(+), 276 deletions(-)
create mode 100644 drivers/ras/amd/atl/Kconfig
create mode 100644 drivers/ras/amd/atl/Makefile
create mode 100644 drivers/ras/amd/atl/access.c
create mode 100644 drivers/ras/amd/atl/core.c
create mode 100644 drivers/ras/amd/atl/dehash.c
create mode 100644 drivers/ras/amd/atl/denormalize.c
create mode 100644 drivers/ras/amd/atl/internal.h
create mode 100644 drivers/ras/amd/atl/map.c
create mode 100644 drivers/ras/amd/atl/reg_fields.h
create mode 100644 drivers/ras/amd/atl/system.c
create mode 100644 drivers/ras/amd/atl/umc.c
create mode 100644 include/linux/amd-atl.h

--
2.34.1