Re: [PATCH v4] x86/power: Fix 'nosmt' vs. hibernation triple fault during resume

From: Andy Lutomirski
Date: Fri May 31 2019 - 10:50:30 EST




> On May 31, 2019, at 7:31 AM, Jiri Kosina <jikos@xxxxxxxxxx> wrote:
>
>> On Fri, 31 May 2019, Andy Lutomirski wrote:
>>
>> 2. Put the CPU all the way to sleep by sending it an INIT IPI.
>>
>> Version 2 seems very simple and robust. Is there a reason we can't do
>> it? We obviously don't want to do it for normal offline because it
>> might be a high-power state, but a cpu in the wait-for-SIPI state is
>> not going to exit that state all by itself.
>>
>> The patch to implement #2 should be short and sweet as long as we are
>> careful to only put genuine APs to sleep like this. The only downside
>> I can see is that a new kernel resuming an old kernel that was
>> booted with nosmt is going to waste power, but I don't think that's a
>> showstopper.
>
> Well, if *that* is not an issue, then the original 3-liner that just
> forces them to 'hlt' [1] would be good enough as well.
>
>

Seems okay to me as long as we're confident we won't get a spurious interrupt.

In general, I don't think we're ever supposed to rely on mwait *staying* asleep. As I understand it, mwait can wake up whenever it wants, and the only real guarantee we have is that the CPU makes some effort to stay asleep until an interrupt is received or the monitor address is poked.
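
To make that concrete, here is a rough user-space sketch (not kernel code, and not a proposed patch) of the only pattern I'd consider safe: treat every return from the wait primitive as possibly spurious and re-check the wakeup condition in a loop. The names wait_until_set() and fake_mwait() are made up for illustration; fake_mwait() is just a stand-in that "wakes up" twice for no reason before the flag is really set.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a MONITOR/MWAIT pair: it may return at any time. */
typedef void (*wait_fn)(atomic_int *addr);

/*
 * Loop until *flag becomes nonzero.  A return from wait_once() is never
 * taken as proof that the flag was written; the flag is always re-read.
 * Returns how many times the wait primitive returned.
 */
static int wait_until_set(atomic_int *flag, wait_fn wait_once)
{
	int wakeups = 0;

	while (!atomic_load(flag)) {
		wait_once(flag);	/* may return spuriously */
		wakeups++;
	}
	return wakeups;
}

/*
 * Illustrative stub: the first two calls return without touching the
 * flag (spurious wakeups); the third call simulates the real wakeup.
 */
static void fake_mwait(atomic_int *addr)
{
	static int calls;

	if (++calls >= 3)
		atomic_store(addr, 1);
}
```

The point is that this loop only works if the code doing the looping is still valid and mapped when the spurious wakeup happens, which is exactly what we can't assume across the resume.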

As a trivial example, if we are in a VM and we get scheduled out at any point between MONITOR and the eventual intentional wakeup, we're toast. Same if we get an SMI due to bad luck or due to a thermal event happening shortly after pushing the power button to resume from hibernate.

For that matter, what actually happens if we get an SMI while halted? Does RSM go directly to sleep or does it re-fetch the HLT?

It seems to me that we should just avoid the scenario where we have IP pointed to a bogus address and we just cross our fingers and hope the CPU doesn't do anything.

I think that, as a short term fix, we should use HLT and, as a long term fix, we should either keep the CPU state fully valid or we should hard-offline the CPU using documented mechanisms, e.g. the WAIT-for-SIPI state.