Re: [RFC] KVM: x86: Allow userspace exit on HLT and MWAIT, else yield on MWAIT

From: Paolo Bonzini
Date: Sat Sep 23 2023 - 05:25:40 EST


On 9/23/23 09:22, David Woodhouse wrote:
> On Fri, 2023-09-22 at 14:00 +0200, Paolo Bonzini wrote:
>> To avoid races you need two flags though; there needs to be also a
>> kernel->userspace communication of whether the vCPU is currently in
>> HLT or MWAIT, using the "flags" field for example. If it was HLT only,
>> moving the mp_state in kvm_run would seem like a good idea; but not if
>> MWAIT or PAUSE are also included.

> Right. When work is added to an empty workqueue, the VMM will want to
> hunt for a vCPU which is currently idle and then signal it to exit.
>
> As you say, for HLT it's simple enough to look at the mp_state, and we
> can move that into kvm_run so it doesn't need an ioctl...

Looking at it again: not so easy, because mp_state is changed in the vCPU thread by vcpu_block() itself.
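
For what it's worth, the VMM side of the hunt would be simple if such a flag were visible without an ioctl. A rough sketch, purely for illustration: struct vcpu_slot and its "idle" field are hypothetical stand-ins for whatever KVM would export (nothing like it exists in kvm_run today), and the compare-and-swap is only there so that two producers do not kick the same vCPU.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-vCPU record kept by the VMM; "idle" stands in for a
 * kernel->userspace "vCPU is in HLT/MWAIT" flag that does not exist today. */
struct vcpu_slot {
    atomic_bool idle;
};

/* Return the index of an idle vCPU and mark it claimed, or -1 if none. */
static int claim_idle_vcpu(struct vcpu_slot *slots, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        bool expected = true;
        /* Only one caller can flip idle from true to false. */
        if (atomic_compare_exchange_strong(&slots[i].idle, &expected, false))
            return (int)i;
    }
    return -1;
}
```

The hard part is not this loop but keeping the flag truthful while the vCPU thread changes state underneath it.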

> although it
> would also be nice to get an *event* on an eventfd when the vCPU
> becomes runnable (as noted, we want that for VSM anyway). Or perhaps
> even to be able to poll() on the vCPU fd.

Why do you need it? You can just use KVM_RUN to go to sleep, and if you get another job you kick out the vCPU with pthread_kill. (I also didn't get the VSM reference).

An interesting quirk is that kvm_run->immediate_exit is processed before kvm_vcpu_block(), but TIF_SIGPENDING is processed afterwards. This means that you can force an mp_state update with pthread_kill + KVM_RUN. It's not going to be a speed demon, but it's worth writing a selftest for it.

> But MWAIT (as currently not-really-emulated) and PAUSE are both just
> transient states with nothing you can really *wait* for, which is why
> they're such fun to deal with.

PAUSE is easier because it is just momentary and you stick it inside what's already a busy wait. MWAIT is less fun because you don't really want to busy wait.
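
To illustrate the PAUSE case, this is the usual shape of guest-side code that executes it: a bounded spin on a flag with one PAUSE per iteration. The names cpu_relax() and spin_until_set() are just for the example (this is not lifted from any particular guest), and the non-x86 branch is a no-op so the sketch builds anywhere.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* One PAUSE on x86; a no-op elsewhere so the sketch still builds. */
static inline void cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();
#endif
}

/* Poll *flag up to max_spins times, pausing between polls.
 * Returns true if the flag was seen set. */
static bool spin_until_set(atomic_bool *flag, unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        if (atomic_load_explicit(flag, memory_order_acquire))
            return true;
        cpu_relax();
    }
    return atomic_load_explicit(flag, memory_order_acquire);
}
```

The loop is already a busy wait, so whatever the host does on a single PAUSE only costs one iteration. There is no equivalent loop around MWAIT to absorb it.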

Paolo