Re: [patch 0/6] Cure kexec() vs. mwait_play_dead() troubles

From: Sean Christopherson
Date: Wed Jun 07 2023 - 23:46:31 EST


On Wed, Jun 07, 2023, Ashok Raj wrote:
> On Wed, Jun 07, 2023 at 10:33:35AM -0700, Sean Christopherson wrote:
> > On Wed, Jun 07, 2023, Ashok Raj wrote:
> > > On Tue, Jun 06, 2023 at 12:41:43AM +0200, Thomas Gleixner wrote:
> > > > >> So parking them via INIT is not completely solving the problem, but it
> > > > >> takes at least NMI and SMI out of the picture.
> > > > >
> > > > > Don't most SMM handlers rendezvous all CPUs? I.e. won't blocking SMIs indefinitely
> > > > > potentially cause problems too?
> > > >
> > > > Not that I'm aware of. If so then this would be a hideous firmware bug
> > > > as firmware must be aware of CPUs which hang around in INIT independent
> > > > of this.
> > >
> > > SMM does do the rendezvous of all CPUs, but also has a way to detect the
> > > blocked ones, in WFS via some package scoped ubox register. So it knows to
> > > skip those. I can find this in internal sources, but they aren't available
> > > in the edk2 open reference code. They happen to be documented only in the
> > > BWG, which isn't available freely.
> >
> > Ah, so putting CPUs into WFS shouldn't result in odd delays. At least not on
> > bare metal. Hmm, and AFAIK the primary use case for SMM in VMs is for secure
>
> Never knew SMM had any role in VM's.. I thought SMM was always native.
>
> Who owns this SMM for VM's.. from the VirtualBIOS?

Yes?

> > boot, so taking SMIs after booting and putting CPUs back into WFS should be ok-ish.
> >
> > Finding a victim to test this in a QEMU VM w/ Secure Boot would be nice to have.
>
> I always seem to turn off secureboot installing Ubuntu :-)

Yeah, I don't utilize it in any of my VMs either.

> I'll try to find someone who might know especially doing SMM In VM.
>
> Can you tell what needs to be validated in the guest? Would doing kexec
> inside the guest with the new patch set be sufficient?
>
> Or you mean in guest, do a kexec and launch secure boot of new kernel?

Yes? I don't actually have hands-on experience with such a setup; I'm familiar
with it purely through bug reports, e.g. this one:

https://lore.kernel.org/all/BYAPR12MB301441A16CE6CFFE17147888A0A09@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

> If there is a specific test you want done, let me know.

Smoke testing is all I was thinking. I wouldn't put too much effort into trying
to make sure this all works. Like I said earlier, nice to have, but certainly not
necessary.