Re: [PATCH v2] x86/power: Fix 'nosmt' vs. hibernation triple fault during resume

From: Josh Poimboeuf
Date: Wed May 29 2019 - 13:21:09 EST


On Wed, May 29, 2019 at 06:26:59PM +0200, Jiri Kosina wrote:
> On Wed, 29 May 2019, Josh Poimboeuf wrote:
>
> > hibernation_restore() is called by user space at runtime, via ioctl or
> > sysfs. So I think this still doesn't fix the case where you've disabled
> > CPUs at runtime via sysfs, and then resumed from hibernation. Or are we
> > declaring that this is not a supported scenario?
>
> Yeah I personally find that scenario awkward :) Anyway, cpuhp_smt_enable()
> is going to online even those potentially "manually" offlined CPUs, isn't
> it?
>
> Are you perhaps suggesting to call enable_nonboot_cpus() instead of
> cpuhp_smt_enable() here to make it more explicit?

Maybe, but I guess that wouldn't work as-is, because enable_nonboot_cpus()
only brings up the CPUs recorded in the frozen_cpus mask.
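
To illustrate the frozen_cpus dependency, here is a toy userspace model
(not kernel code). It assumes one bit per CPU and uses hypothetical
toy_* helpers that only mimic the bookkeeping: the "disable" step
records which CPUs it took offline, and the "enable" step brings back
only those. CPUs that were already offline beforehand (nosmt siblings,
or ones offlined via sysfs) are never recorded, so they stay offline.

```c
#include <assert.h>
#include <stdint.h>

/* Toy userspace model, NOT kernel code: one bit per CPU. */
struct cpu_state {
	uint32_t online;	/* CPUs currently online */
	uint32_t frozen;	/* stand-in for the kernel's frozen_cpus mask */
};

/*
 * Models the bookkeeping of disable_nonboot_cpus(): take every online
 * CPU except CPU 0 offline and remember exactly which ones were taken.
 */
static void toy_disable_nonboot_cpus(struct cpu_state *s)
{
	uint32_t taken;

	assert(s->online & UINT32_C(1));	/* assumption: CPU 0 is the boot CPU */

	taken = s->online & ~UINT32_C(1);
	s->frozen = taken;
	s->online &= ~taken;
}

/*
 * Models the bookkeeping of enable_nonboot_cpus(): bring up only the
 * CPUs recorded in the frozen mask, then clear the mask.
 */
static void toy_enable_nonboot_cpus(struct cpu_state *s)
{
	s->online |= s->frozen;
	s->frozen = 0;
}

/* Hypothetical helper: run one disable/enable round trip. */
static uint32_t toy_roundtrip(uint32_t online_before)
{
	struct cpu_state s = { .online = online_before, .frozen = 0 };

	toy_disable_nonboot_cpus(&s);
	toy_enable_nonboot_cpus(&s);
	return s.online;
}
```

In this model, toy_roundtrip(0x5) (CPUs 0 and 2 online, 1 and 3 already
offline) returns 0x5 again: the round trip restores the previous state
but never onlines a CPU that was offline going in.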

But maybe this is just a scenario we don't care about anyway?

I still have the question about whether we could make mwait_play_dead()
monitor a fixed address. If we could get that to work, that seems more
robust to me.

Another question. With your patch, if booted with nosmt, is SMT still
disabled after you resume from hibernation? I don't see how SMT would
get disabled again.

> > Is there a reason why maxcpus= doesn't do the CR4.MCE booted_once
> > dance?
>
> I am not sure whether it's really needed. My understanding is that the MCE
> issue happens only after primary sibling has been brought up; if that
> never happened, MCE wouldn't be broadcast to that core at all in the
> first place.
>
> But this needs to be confirmed by Intel.

Right, but can't maxcpus= create scenarios where only the primary
sibling has been brought up?

Anyway, Thomas indicated on IRC that maxcpus= may be deprecated and
should probably be documented as such. So maybe it's another scenario
we don't care about.

--
Josh