[PATCH v2 5/7] EDAC/amd64: Decode syndrome before translating address

From: Ghannam, Yazen
Date: Tue Jul 09 2019 - 17:57:19 EST


From: Yazen Ghannam <yazen.ghannam@xxxxxxx>

AMD Family 17h systems currently require address translation in order to
report the system address of a DRAM ECC error. This is currently done
before decoding the syndrome information. The syndrome information does
not depend on the address translation, so the proper EDAC csrow/channel
reporting can function without the address. However, the syndrome
information will not be decoded if the address translation fails.

Decode the syndrome information before doing the address translation.
The syndrome information is architecturally defined in MCA_SYND and can
be considered robust. The address translation is system-specific and may
fail on newer systems without proper updates to the translation
algorithm.

Fixes: 713ad54675fd ("EDAC, amd64: Define and register UMC error decode function")
Signed-off-by: Yazen Ghannam <yazen.ghannam@xxxxxxx>
---
Link:
https://lkml.kernel.org/r/20190531234501.32826-7-Yazen.Ghannam@xxxxxxx

v1->v2:
* No change.

drivers/edac/amd64_edac.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index f0424c10cac0..4058b24b8e04 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2567,13 +2567,6 @@ static void decode_umc_error(int node_id, struct mce *m)

err.channel = find_umc_channel(m);

- if (umc_normaddr_to_sysaddr(m->addr, pvt->mc_node_id, err.channel, &sys_addr)) {
- err.err_code = ERR_NORM_ADDR;
- goto log_error;
- }
-
- error_address_to_page_and_offset(sys_addr, &err);
-
if (!(m->status & MCI_STATUS_SYNDV)) {
err.err_code = ERR_SYND;
goto log_error;
@@ -2590,6 +2583,13 @@ static void decode_umc_error(int node_id, struct mce *m)

err.csrow = m->synd & 0x7;

+ if (umc_normaddr_to_sysaddr(m->addr, pvt->mc_node_id, err.channel, &sys_addr)) {
+ err.err_code = ERR_NORM_ADDR;
+ goto log_error;
+ }
+
+ error_address_to_page_and_offset(sys_addr, &err);
+
log_error:
__log_ecc_error(mci, &err, ecc_type);
}
--
2.17.1