RE: [PATCH v3 4/6] x86/MCE: Make number of MCA banks per_cpu

From: Ghannam, Yazen
Date: Tue May 21 2019 - 13:55:25 EST


> -----Original Message-----
> From: Borislav Petkov <bp@xxxxxxxxx>
> Sent: Saturday, May 18, 2019 6:26 AM
> To: Ghannam, Yazen <Yazen.Ghannam@xxxxxxx>
> Cc: linux-edac@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; bp@xxxxxxx; tony.luck@xxxxxxxxx; x86@xxxxxxxxxx
> Subject: Re: [PATCH v3 4/6] x86/MCE: Make number of MCA banks per_cpu
>
>
> On Tue, Apr 30, 2019 at 08:32:20PM +0000, Ghannam, Yazen wrote:
> > From: Yazen Ghannam <yazen.ghannam@xxxxxxx>
> >
> > The number of MCA banks is provided per logical CPU. Historically, this
> > number has been the same across all CPUs, but this is not an
> > architectural guarantee. Future AMD systems may have MCA bank counts
> > that vary between logical CPUs in a system.
> >
> > This issue was partially addressed in
> >
> > 006c077041dc ("x86/mce: Handle varying MCA bank counts")
> >
> > by allocating structures using the maximum number of MCA banks and by
> > saving the maximum MCA bank count in a system as the global count. This
> > means that some extra structures are allocated. Also, this means that
> > CPUs will spend more time in the #MC and other handlers checking extra
> > MCA banks.
>
> ...
>
> > @@ -1480,14 +1482,15 @@ EXPORT_SYMBOL_GPL(mce_notify_irq);
> >
> > static int __mcheck_cpu_mce_banks_init(void)
> > {
> > + u8 n_banks = this_cpu_read(mce_num_banks);
> > struct mce_bank *mce_banks;
> > int i;
> >
> > - mce_banks = kcalloc(MAX_NR_BANKS, sizeof(struct mce_bank), GFP_KERNEL);
> > + mce_banks = kcalloc(n_banks, sizeof(struct mce_bank), GFP_KERNEL);
>
> Something changed in mm land or maybe we were lucky and got away with an
> atomic GFP_KERNEL allocation until now but:
>
> [ 2.447838] smp: Bringing up secondary CPUs ...
> [ 2.456895] x86: Booting SMP configuration:
> [ 2.457822] .... node #0, CPUs: #1

The issue seems to be that the allocation is now happening on CPUs other than CPU0.

Patch 2 in this set has the same issue. I didn't see it until I turned on the "Lock Debugging" config options.

> [ 1.344284] BUG: sleeping function called from invalid context at mm/slab.h:418

This message comes from ___might_sleep(), which checks system_state before printing anything.

On CPU0, system_state=SYSTEM_BOOTING.

On every other CPU, system_state=SYSTEM_SCHEDULING. That is past SYSTEM_BOOTING, so the check is no longer suppressed and the message is shown.
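
To make the system_state dependence concrete, here is a minimal user-space sketch of the check as I read it. It is a simplified model, not the kernel's code: might_sleep_would_warn() and demo_boot_sequence() are hypothetical names, the atomic_context flag is an assumption that stands in for the preempt-count, IRQs-disabled and idle-task tests, and oops_in_progress is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to follow the ordering of the kernel's enum system_states. */
enum system_states {
	SYSTEM_BOOTING,
	SYSTEM_SCHEDULING,
	SYSTEM_RUNNING,
	SYSTEM_HALT,
	SYSTEM_POWER_OFF,
	SYSTEM_RESTART,
	SYSTEM_SUSPEND,
};

/*
 * Hypothetical helper: returns true when this model would print
 * "BUG: sleeping function called from invalid context".
 * atomic_context is a stand-in for the real context checks.
 */
bool might_sleep_would_warn(enum system_states state, bool atomic_context)
{
	/* Sleeping is fine when the caller is allowed to sleep. */
	if (!atomic_context)
		return false;

	/* The report is suppressed during early boot and after SYSTEM_RUNNING. */
	if (state == SYSTEM_BOOTING || state > SYSTEM_RUNNING)
		return false;

	return true;
}

/* Same sleeping allocation, two different points in boot. */
void demo_boot_sequence(void)
{
	/* CPU0: allocation happens while system_state == SYSTEM_BOOTING. */
	assert(!might_sleep_would_warn(SYSTEM_BOOTING, true));

	/* Secondary CPUs: system_state has moved on to SYSTEM_SCHEDULING. */
	assert(might_sleep_would_warn(SYSTEM_SCHEDULING, true));
}
```

Under these assumptions the same GFP_KERNEL allocation is silent on CPU0 and reported on every secondary CPU, which matches the splat above.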

Changing GFP_KERNEL to GFP_ATOMIC seems to fix it. Is this appropriate? Or do you think there's something else we could try?

Thanks,
Yazen