RE: [PATCH RESEND 2/5] x86/MCE: Handle MCA controls in a per_cpu way
From: Ghannam, Yazen
Date: Wed Apr 10 2019 - 12:58:20 EST
> -----Original Message-----
> From: Borislav Petkov <bp@xxxxxxxxx>
> Sent: Wednesday, April 10, 2019 11:41 AM
> To: Ghannam, Yazen <Yazen.Ghannam@xxxxxxx>
> Cc: linux-edac@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; tony.luck@xxxxxxxxx; x86@xxxxxxxxxx
> Subject: Re: [PATCH RESEND 2/5] x86/MCE: Handle MCA controls in a per_cpu way
>
> On Wed, Apr 10, 2019 at 04:36:30PM +0000, Ghannam, Yazen wrote:
> > We have this case on AMD Family 17h with Bank 4. The hardware enforces
> > this bank to be Read-as-Zero/Writes-Ignored.
> >
> > This behavior is enforced whether the bank is in the middle or at the
> > end.
>
> Does num_banks contain the disabled bank? If so, then it will work.
>
Yes, unused banks in the middle are counted in the MCG_CAP[Count] value.

> > I'm thinking to redo the sysfs interface for banks in another patch
> > set. I could include a new file to indicate enabled/disabled, or maybe
> > just update the documentation to describe this case.
>
> No, the write to the bank controls should fail on a disabled bank.
>
Okay, so you're saying the sysfs access should fail if a bank is disabled. Is that correct?

Does "disabled" mean one or both of these?

  Unused        = RAZ/WI in hardware
  Uninitialized = Not initialized by kernel due to quirks, etc.

For an unused bank, it doesn't hurt to write MCA_CTL, but really there's no reason to do so and go through mce_restart().

For an uninitialized bank, should we prevent users from overriding the kernel's settings?

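To make the two cases concrete, here is a rough userspace sketch of a store path that rejects writes for both kinds of "disabled" bank. Everything in it is an assumption for illustration only: struct bank_model, bank_ctl_store(), and the choice of -ENODEV are hypothetical names and values, not taken from the patch set or from the existing sysfs code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-bank state; not the structures used in the patch set. */
struct bank_model {
	uint64_t ctl;   /* MCA_CTL value requested through sysfs */
	bool unused;    /* bank is RAZ/WI in hardware */
	bool init;      /* bank was initialized by the kernel (no quirk skipped it) */
};

/*
 * Hypothetical store handler: fail the write if the bank is disabled in
 * either sense. Returns 0 on success or a negative errno value.
 */
static int bank_ctl_store(struct bank_model *banks, unsigned int num_banks,
			  unsigned int bank, uint64_t val)
{
	assert(banks != NULL);

	if (bank >= num_banks)
		return -EINVAL;

	/* Unused: the write would be ignored by hardware anyway. */
	if (banks[bank].unused)
		return -ENODEV;

	/* Uninitialized: do not let the user override the kernel's choice. */
	if (!banks[bank].init)
		return -ENODEV;

	/* A real handler would go on to restart MCE handling at this point. */
	banks[bank].ctl = val;
	return 0;
}
```

If only one of the two meanings is intended, the corresponding check can simply be dropped.
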
Thanks,
Yazen