Re: [PATCH V2 3/6] x86/mce: Define 'SUCCOR' cpuid bit

From: Borislav Petkov
Date: Wed May 06 2015 - 14:49:35 EST


On Wed, May 06, 2015 at 06:58:55AM -0500, Aravind Gopalakrishnan wrote:
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> index e535533..6d218ac 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -1639,7 +1639,8 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
> break;
> case X86_VENDOR_AMD:
> mce_amd_feature_init(c);
> - mce_flags.overflow_recov = cpuid_ebx(0x80000007) & 0x1;
> + mce_flags.overflow_recov = !!(cpuid_ebx(0x80000007) & BIT(0));
> + mce_flags.succor = !!(cpuid_ebx(0x80000007) & BIT(1));

Yuck, I thought gcc would be smart enough to recognize that we're
calling CPUID twice with the same params and save us the work but nooo.
asm looks yucky.

Oh well, I had to save us the second CPUID read and beef up the commit
message:

---
From: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@xxxxxxx>
Date: Wed, 6 May 2015 06:58:55 -0500
Subject: [PATCH] x86/mce: Add support for deferred errors on AMD

Deferred errors indicate error conditions that were not corrected, but
those errors have not been consumed yet. They require no action from
S/W (or action is optional). These errors provide info about a latent
uncorrectable MCE that can occur when a poisoned data is consumed by the
processor.

Newer AMD processors can generate deferred errors and can be configured
to generate APIC interrupts on such events.

SUCCOR stands for S/W UnCorrectable error COntainment and Recovery.
It indicates support for data poisoning in HW and deferred error
interrupts.

Add new bitfield to mce_vendor_flags for this. We use this to verify
presence of deferred error interrupts before we enable them in mce_amd.c

While at it, clarify comments in mce_vendor_flags to provide an
indication of usages of the bitfields.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@xxxxxxx>
Cc: Tony Luck <tony.luck@xxxxxxxxx>
Cc: x86-ml <x86@xxxxxxxxxx>
Cc: linux-edac <linux-edac@xxxxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/1430913538-1415-4-git-send-email-Aravind.Gopalakrishnan@xxxxxxx
[ beef up commit message, do CPUID(8000_0007) only once. ]
Signed-off-by: Borislav Petkov <bp@xxxxxxx>
---
arch/x86/include/asm/mce.h | 15 +++++++++++++--
arch/x86/kernel/cpu/mcheck/mce.c | 10 ++++++++--
2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 1f5a86d518db..407ced642ac1 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -117,8 +117,19 @@ struct mca_config {
};

struct mce_vendor_flags {
- __u64 overflow_recov : 1, /* cpuid_ebx(80000007) */
- __reserved_0 : 63;
+ /*
+ * overflow recovery cpuid bit indicates that overflow
+ * conditions are not fatal
+ */
+ __u64 overflow_recov : 1,
+
+ /*
+ * SUCCOR stands for S/W UnCorrectable error COntainment
+ * and Recovery. It indicates support for data poisoning
+ * in HW and deferred error interrupts.
+ */
+ succor : 1,
+ __reserved_0 : 62;
};
extern struct mce_vendor_flags mce_flags;

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index e535533d5ab8..521e5016aca6 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1637,10 +1637,16 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
mce_intel_feature_init(c);
mce_adjust_timer = cmci_intel_adjust_timer;
break;
- case X86_VENDOR_AMD:
+
+ case X86_VENDOR_AMD: {
+ u32 ebx = cpuid_ebx(0x80000007);
+
mce_amd_feature_init(c);
- mce_flags.overflow_recov = cpuid_ebx(0x80000007) & 0x1;
+ mce_flags.overflow_recov = !!(ebx & BIT(0));
+ mce_flags.succor = !!(ebx & BIT(1));
break;
+ }
+
default:
break;
}
--
2.3.5

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/