RE: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware

From: Ghannam, Yazen
Date: Thu May 16 2019 - 12:16:19 EST


> -----Original Message-----
> From: Luck, Tony <tony.luck@xxxxxxxxx>
> Sent: Thursday, May 16, 2019 10:52 AM
> To: Ghannam, Yazen <Yazen.Ghannam@xxxxxxx>
> Cc: linux-edac@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; bp@xxxxxxx; x86@xxxxxxxxxx
> Subject: Re: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware
>
>
> On Tue, Apr 30, 2019 at 08:32:20PM +0000, Ghannam, Yazen wrote:
> > diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> > index 986de830f26e..551366c155ef 100644
> > --- a/arch/x86/kernel/cpu/mce/core.c
> > +++ b/arch/x86/kernel/cpu/mce/core.c
> > @@ -1567,10 +1567,13 @@ static void __mcheck_cpu_init_clear_banks(void)
> >  	for (i = 0; i < this_cpu_read(mce_num_banks); i++) {
> >  		struct mce_bank *b = &mce_banks[i];
> >
> > -		if (!b->init)
> > -			continue;
> > -		wrmsrl(msr_ops.ctl(i), b->ctl);
> > -		wrmsrl(msr_ops.status(i), 0);
> > +		if (b->init) {
> > +			wrmsrl(msr_ops.ctl(i), b->ctl);
> > +			wrmsrl(msr_ops.status(i), 0);
> > +		}
> > +
> > +		/* Save bits set in hardware. */
> > +		rdmsrl(msr_ops.ctl(i), b->ctl);
> >  	}
> >  }
>
> This looks like it will be a problem for Intel CPUs. If
> we take a CPU offline, and then bring it back again, we
> use "b->ctl" to reinitialize the register in mce_reenable_cpu().
>
> But the Intel SDM says at the end of section "15.3.2.1 IA32_MCi_CTL_MSRs":
>
> "P6 family processors only allow the writing of all 1s or all
> 0s to the IA32_MCi_CTL MSR."
>

I can put a vendor check on the read. Is that sufficient?
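
As a rough user-space sketch of the idea (not the kernel change itself): the
read-back is gated on the vendor, so the saved control value is only replaced
where that is wanted. Everything here is hypothetical and for illustration
only: enum vendor, struct bank, sim_write_ctl(), sim_read_ctl(), init_bank()
and the hw_mask field are stand-ins, and the model assumes the simulated
hardware keeps only a subset of the bits written to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum vendor { VENDOR_INTEL, VENDOR_AMD };

struct bank {
	uint64_t ctl;      /* software copy, reused when a CPU comes back online */
	bool init;         /* whether software programs this bank */
	uint64_t hw_ctl;   /* simulated control register */
	uint64_t hw_mask;  /* assumed: bits the simulated hardware implements */
};

/* Simulated register write: only implemented bits stick. */
static void sim_write_ctl(struct bank *b, uint64_t val)
{
	b->hw_ctl = val & b->hw_mask;
}

/* Simulated register read. */
static uint64_t sim_read_ctl(const struct bank *b)
{
	return b->hw_ctl;
}

/* Bank init with the read-back gated on the vendor. */
static void init_bank(struct bank *b, enum vendor v)
{
	assert(b != NULL);

	if (b->init)
		sim_write_ctl(b, b->ctl);

	/* Save bits set in hardware, but only on the vendor that wants it. */
	if (v == VENDOR_AMD)
		b->ctl = sim_read_ctl(b);
}
```

With a bank whose ctl starts as all 1s and whose hw_mask is 0xff,
init_bank(&b, VENDOR_AMD) leaves b.ctl at 0xff, while
init_bank(&b, VENDOR_INTEL) leaves it at all 1s, so a later re-enable would
still write all 1s on Intel.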

Thanks,
Yazen