Re: [RFC] KVM: x86: Allow userspace exit on HLT and MWAIT, else yield on MWAIT

From: Sean Christopherson
Date: Tue Sep 26 2023 - 16:29:47 EST


On Tue, Sep 26, 2023, David Woodhouse wrote:
>
>
> On 26 September 2023 19:20:24 CEST, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
> >On Sat, Sep 23, 2023 at 6:44 PM Alexander Graf <graf@xxxxxxxxx> wrote:
> >> On 23.09.23 11:24, Paolo Bonzini wrote:
> >> > Why do you need it? You can just use KVM_RUN to go to sleep, and if you
> >> > get another job you kick out the vCPU with pthread_kill. (I also didn't
> >> > get the VSM reference).
> >>
> >> With the original VSM patches, we used to make a vCPU aware of the fact
> >> that it can morph into one of many VTLs. That approach turned out to be
> >> insanely intrusive and fragile and so we're currently reimplementing
> >> everything as VTLs as vCPUs. That allows us to move the majority of VSM
> >> functionality to user space. Everything we've seen so far looks as if
> >> there is no real performance loss with that approach.
> >
> >Yes, that was also what I remember, sharing the FPU somehow while
> >having separate vCPU file descriptors.
> >
> >> One small problem with that is that now user space is responsible for
> >> switching between VTLs: It determines which VTL is currently running and
> >> leaves all others (read: all other vCPUs) as stopped. That means if you
> >> are running happily in KVM_RUN in VTL0 and VTL1 gets an interrupt, user
> >> space needs to stop VTL0 and unpause VTL1 until it triggers VTL_RETURN
> >> at which point VTL1 stops execution and VTL0 runs again.
> >
> >That's with IPIs in VTL1, right? I understand now. My idea was, since
> >we need a link from VTL1 to VTL0 for the FPU, to use the same link to
> >trigger a vmexit to userspace if source VTL > destination VTL. I am
> >not sure how you would handle the case where the destination vCPU is
> >not running; probably by detecting the IPI when VTL0 restarts on the
> >destination vCPU?
> >
> >In any case, making vCPUs poll()-able is sensible.
>
> Thinking about this a bit more, even for HLT it probably isn't just as simple
> as checking for mp_state changes. If there's a REQ_EVENT outstanding for
> something like a timer delivery, that won't get handled and the IRQ actually
> delivered to the local APIC until the vCPU is actually *run*, will it?

I haven't been following this conversation, just reacting to seeing "HLT" and
"mp_state", which is a bit of a mess:

https://lore.kernel.org/all/ZMgIQ5m1jMSAogT4@xxxxxxxxxx