Re: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware

From: Borislav Petkov
Date: Fri May 17 2019 - 06:12:16 EST


On Thu, May 16, 2019 at 01:59:43PM -0700, Luck, Tony wrote:
> I think the intent of the original patch was to find out
> which bits are "implemented in hardware". I.e. throw all
> 1's at the register and see if any of them stick.

And, in addition, check ->init before showing/setting a bank:

---
@@ -2095,6 +2098,9 @@ static ssize_t show_bank(struct device *s, struct device_attribute *attr,

 	b = &per_cpu(mce_banks_array, s->id)[bank];

+	if (!b->init)
+		return -ENODEV;
+
 	return sprintf(buf, "%llx\n", b->ctl);
 }

@@ -2113,6 +2119,9 @@ static ssize_t set_bank(struct device *s, struct device_attribute *attr,

 	b = &per_cpu(mce_banks_array, s->id)[bank];

+	if (!b->init)
+		return -ENODEV;
+
 	b->ctl = new;
 	mce_restart();
---

so that you get feedback on whether the setting has even succeeded or
not. Right now we're doing "something" blindly and accepting any b->ctl
from userspace. Yeah, it is root-only but still...
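
For illustration only, here is a minimal userspace sketch of that
behavior. It is not kernel code: struct model_bank, model_show_bank()
and model_set_bank() are hypothetical stand-ins for struct mce_bank,
show_bank() and set_bank(), and the sketch assumes that a bank with
->init == 0 should reject both reads and writes with -ENODEV.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the kernel's struct mce_bank. */
struct model_bank {
	uint64_t ctl;	/* control bits requested for the bank */
	int init;	/* nonzero if the bank is usable */
};

/* Models show_bank(): print ctl in hex, or fail for an unusable bank. */
static int model_show_bank(const struct model_bank *b, char *buf, size_t len)
{
	assert(b != NULL && buf != NULL);

	if (!b->init)
		return -ENODEV;

	return snprintf(buf, len, "%llx\n", (unsigned long long)b->ctl);
}

/* Models set_bank(): accept a new ctl value only for a usable bank. */
static int model_set_bank(struct model_bank *b, uint64_t new_ctl)
{
	assert(b != NULL);

	if (!b->init)
		return -ENODEV;

	b->ctl = new_ctl;
	return 0;
}
```

With this, a write to an uninitialized bank leaves ctl untouched and
the caller sees the error instead of a silent success.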

> I don't object to the idea behind the patch. But if you want
> to do this you just should not modify b->ctl.
>
> So something like:
>
>
> static void __mcheck_cpu_init_clear_banks(void)
> {
> 	struct mce_bank *mce_banks = this_cpu_read(mce_banks_array);
> 	u64 tmp;
> 	int i;
>
> 	for (i = 0; i < this_cpu_read(mce_num_banks); i++) {
> 		struct mce_bank *b = &mce_banks[i];
>
> 		if (b->init) {
> 			wrmsrl(msr_ops.ctl(i), b->ctl);
> 			wrmsrl(msr_ops.status(i), 0);
> 			rdmsrl(msr_ops.ctl(i), tmp);
>
> 			/* Check if any bits implemented in h/w */
> 			b->init = !!tmp;
> 		}

... except that we unconditionally set ->init to 1 in
__mcheck_cpu_mce_banks_init() and I think we should query it. Btw, that
name __mcheck_cpu_mce_banks_init() is hideous too. I'll fix those up. In
the meantime, how does the below look? The change is to tickle out
from the hw whether some CTL bits stick and then use that to determine
the b->init setting:

---
From: Yazen Ghannam <yazen.ghannam@xxxxxxx>
Date: Tue, 30 Apr 2019 20:32:21 +0000
Subject: [PATCH] x86/MCE: Determine MCA banks' init state properly

The OS is expected to write all bits to MCA_CTL for each bank,
thus enabling error reporting in all banks. However, some banks
may be unused in which case the registers for such banks are
Read-as-Zero/Writes-Ignored. Also, the OS may avoid setting some control
bits because of quirks, etc.

A bank can be considered uninitialized if the MCA_CTL register returns
zero. This is because either the OS did not write anything or because
the hardware is enforcing RAZ/WI for the bank.

Set a bank's init value based on if the control bits are set or not in
hardware. Return an error code in the sysfs interface for uninitialized
banks.

[ bp: Massage a bit. Discover bank init state at boot. ]

Signed-off-by: Yazen Ghannam <yazen.ghannam@xxxxxxx>
Signed-off-by: Borislav Petkov <bp@xxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: "linux-edac@xxxxxxxxxxxxxxx" <linux-edac@xxxxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Tony Luck <tony.luck@xxxxxxxxx>
Cc: "x86@xxxxxxxxxx" <x86@xxxxxxxxxx>
Link: https://lkml.kernel.org/r/20190430203206.104163-7-Yazen.Ghannam@xxxxxxx
---
arch/x86/kernel/cpu/mce/core.c | 23 ++++++++++++++++++-----
1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 5bcecadcf4d9..d84b0c707d0e 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1492,9 +1492,16 @@ static int __mcheck_cpu_mce_banks_init(void)

 	for (i = 0; i < n_banks; i++) {
 		struct mce_bank *b = &mce_banks[i];
+		u64 val;

 		b->ctl = -1ULL;
-		b->init = 1;
+
+		/* Check if any bits are implemented in h/w */
+		wrmsrl(msr_ops.ctl(i), b->ctl);
+		rdmsrl(msr_ops.ctl(i), val);
+		b->init = !!val;
+
+		wrmsrl(msr_ops.status(i), 0);
 	}

 	per_cpu(mce_banks_array, smp_processor_id()) = mce_banks;
@@ -1567,10 +1574,10 @@ static void __mcheck_cpu_init_clear_banks(void)
 	for (i = 0; i < this_cpu_read(mce_num_banks); i++) {
 		struct mce_bank *b = &mce_banks[i];

-		if (!b->init)
-			continue;
-		wrmsrl(msr_ops.ctl(i), b->ctl);
-		wrmsrl(msr_ops.status(i), 0);
+		if (b->init) {
+			wrmsrl(msr_ops.ctl(i), b->ctl);
+			wrmsrl(msr_ops.status(i), 0);
+		}
 	}
 }

@@ -2095,6 +2102,9 @@ static ssize_t show_bank(struct device *s, struct device_attribute *attr,

 	b = &per_cpu(mce_banks_array, s->id)[bank];

+	if (!b->init)
+		return -ENODEV;
+
 	return sprintf(buf, "%llx\n", b->ctl);
 }

@@ -2113,6 +2123,9 @@ static ssize_t set_bank(struct device *s, struct device_attribute *attr,

 	b = &per_cpu(mce_banks_array, s->id)[bank];

+	if (!b->init)
+		return -ENODEV;
+
 	b->ctl = new;
 	mce_restart();

--
2.21.0
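
As a rough illustration of the write-all-ones probe in the patch above,
the following userspace sketch models a few CTL registers, one of which
is Read-as-Zero/Writes-Ignored. Everything here is an assumption made
for the example: model_wrmsrl() and model_rdmsrl() are hypothetical
stand-ins for wrmsrl()/rdmsrl(), and the per-bank masks in
model_implemented[] are invented, not taken from any real CPU.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NUM_BANKS 4

/* Hypothetical hardware model: bits each bank's CTL register implements. */
static const uint64_t model_implemented[MODEL_NUM_BANKS] = {
	~0ULL,		/* fully implemented */
	0x00ffULL,	/* only the low 8 bits exist */
	0,		/* unused bank: Read-as-Zero/Writes-Ignored */
	0x1ULL,		/* a single implemented bit */
};

static uint64_t model_ctl[MODEL_NUM_BANKS];

/* Stand-in for wrmsrl(): writes to unimplemented bits are dropped. */
static void model_wrmsrl(unsigned int bank, uint64_t val)
{
	assert(bank < MODEL_NUM_BANKS);
	model_ctl[bank] = val & model_implemented[bank];
}

/* Stand-in for rdmsrl(): returns whatever bits actually stuck. */
static uint64_t model_rdmsrl(unsigned int bank)
{
	assert(bank < MODEL_NUM_BANKS);
	return model_ctl[bank];
}

/* Write all ones, read back, and report whether any bit stuck. */
static int model_probe_bank(unsigned int bank)
{
	model_wrmsrl(bank, ~0ULL);
	return !!model_rdmsrl(bank);
}
```

In this model only the RAZ/WI bank probes as uninitialized; a bank with
even one implemented bit still counts as initialized, which matches the
!!val check in the patch.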

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.