Re: [PATCH v4] x86/power: Fix 'nosmt' vs. hibernation triple fault during resume

From: Jiri Kosina
Date: Fri May 31 2019 - 17:09:03 EST


On Fri, 31 May 2019, Andy Lutomirski wrote:

> The Intel SDM Vol 3 34.10 says:
>
> If the HLT instruction is restarted, the processor will generate a
> memory access to fetch the HLT instruction (if it is
> not in the internal cache), and execute a HLT bus transaction. This
> behavior results in multiple HLT bus transactions
> for the same HLT instruction.

Which basically means that both hibernation and kexec have been broken in
this respect for gazillions of years, and it seems no one noticed. Makes
one wonder what the reason for that might be.

Either the SDM is not precise and the refetch never actually happens for
real (or is perhaps always satisfied from the I$ in these cases?), or ... ?

So my patch basically puts things back where they have been for ages
(while mwait is obviously much worse, as that gets woken up by the write
to the monitored address, which inevitably does happen during resume), but
it seems the SDM is suggesting that we've been in a grey zone wrt RSM at
least for all those ages.

So perhaps we really should ditch resume_play_dead() altogether
eventually, and replace it with sending INIT IPI around instead (and then
waking the CPUs properly via INIT INIT START). I'd still like to do that
for 5.3 though, as that'd be slightly bigger surgery, and for 5.2
conservatively put things basically back to the state they have been in
up to now.

Thanks,

--
Jiri Kosina
SUSE Labs