Re: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware

From: Luck, Tony
Date: Thu May 16 2019 - 17:01:51 EST


On Thu, May 16, 2019 at 10:34:56PM +0200, Borislav Petkov wrote:
> On Thu, May 16, 2019 at 08:20:58PM +0000, Ghannam, Yazen wrote:
> > We don't actually know if there are bits set in hardware until we read
> > it back. So I don't think this is adding anything new.
>
> Bah, of course. We need to read it first (pasting the whole function).
> Now, __mcheck_cpu_init_clear_banks() gets called when we change
> configuration too, in mce_cpu_restart() and if we do it this way, we'll
> be rereading MCi_CTL each time but I don't see anything wrong with that.

Intel doesn't "set any bits in hardware" ... so I think you'll just
get a 0x0 and disable everything.

>
> Hmmm?
>
> static void __mcheck_cpu_init_clear_banks(void)
> {
> struct mce_bank *mce_banks = this_cpu_read(mce_banks_array);
> int i;
>
> for (i = 0; i < this_cpu_read(mce_num_banks); i++) {
> struct mce_bank *b = &mce_banks[i];
>
> rdmsrl(msr_ops.ctl(i), b->ctl);
>
> /* Bank is initialized if bits are set in hardware. */
> b->init = !!b->ctl;
> if (b->init) {
> wrmsrl(msr_ops.ctl(i), b->ctl);
> wrmsrl(msr_ops.status(i), 0);
> }
>
> }
> }


I think the intent of the original patch was to find out
which bits are "implemented in hardware". I.e. throw all
1's at the register and see if any of them stick.

I don't object to the idea behind the patch. But if you want
to do this you just should not modify b->ctl.

So something like:


static void __mcheck_cpu_init_clear_banks(void)
{
struct mce_bank *mce_banks = this_cpu_read(mce_banks_array);
u64 tmp;
int i;

for (i = 0; i < this_cpu_read(mce_num_banks); i++) {
struct mce_bank *b = &mce_banks[i];

if (b->init) {
wrmsrl(msr_ops.ctl(i), b->ctl);
wrmsrl(msr_ops.status(i), 0);
rdmsrl(msr_ops.ctl(i), tmp);

/* Check if any bits implemented in h/w */
b->init = !!tmp;
}

}
}

-Tony

-Tony