Re: [PATCH v5 untested] kvm: better MWAIT emulation for guests

From: Gabriel Somlo
Date: Tue Mar 21 2017 - 18:51:34 EST


On Tue, Mar 21, 2017 at 08:22:39PM +0100, Radim Krčmář wrote:
> 2017-03-21 10:29-0700, Nadav Amit:
> >
> > > On Mar 21, 2017, at 9:58 AM, Radim Krčmář <rkrcmar@xxxxxxxxxx> wrote:
> >
> > > In '-smp 2', the writing VCPU always does 10000 wakeups by writing into
> > > monitored memory, but the mwaiting VCPU can be also woken up by host
> > > interrupts, which might add a few exits depending on timing.
> > >
> > > I didn't spend much time in making the PASS/FAIL mean much, or ensuring
> > > that we only get 10000 wakeups ... it is nothing to be worried about.
> > >
> > > Hint 240 behaves as nop even on my system, so I still don't find
> > > anything insane on that machine (if OS X is excluded) ...

And I get the exact same results on the MacBookAir4,2 (which exhibits
no freezing or extreme sluggishness when running OS X 10.7 smp with
Michael's KVM MWAIT-in-L1 patch)...

> >
> > From my days in Intel (10 years ago), I can say that MWAIT wakes for many
> > microarchitectural events besides interrupts.
> >
> > Out of curiosity, aren't you worried that on OS X the wbinvd causes an exit
> > after the monitor and before the mwait?
>
> VM entry clears the monitoring, so it should behave just like an MWAIT
> without MONITOR, which is NOP according to the spec. It does so on
> modern hardware, but it definitely is a good thing to try ...
> (I am worried about disabling MWAIT exits by default and it's a no-go
> until we understand why OS X doesn't work.)
>
> Gabriel, how does testing with this change behave on the old machine?
>
> Thanks.
>
> ---8<---
> This should be the same as "wbinvd", because "wbinvd" does nothing
> without non-coherent vfio.
> Simply replacing "vmcall" with "wbinvd" is an option if the "vmcall"
> version works as expected.
> ---
> diff --git a/x86/mwait.c b/x86/mwait.c
> index 20f4dcbff8ae..19f988b94541 100644
> --- a/x86/mwait.c
> +++ b/x86/mwait.c
> @@ -54,6 +54,7 @@ int main(int argc, char **argv)
>
>  	while ((smp ? *page : resumes) < TARGET_RESUMES) {
>  		asm volatile("monitor" :: "a" (page), "c" (0), "d" (0));
> +		asm volatile("vmcall" :: "a"(-1));
>  		asm volatile("mwait" :: "a" (eax), "c" (ecx));
>  		resumes++;
>  	}

Sure thing, here's the MacPro1,1 results:

[kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 0'
timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 0
enabling apic
PASS: resumed from mwait 10000 times
SUMMARY: 1 tests

real 0m1.709s
user 0m0.547s
sys 0m0.243s
[kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 1'
timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 1
enabling apic
PASS: resumed from mwait 10000 times
SUMMARY: 1 tests

real 0m0.752s
user 0m0.545s
sys 0m0.218s
[kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 0' -smp 2
timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 0 -smp 2
enabling apic
enabling apic
FAIL: resumed from mwait 10004 times
SUMMARY: 1 tests, 1 unexpected failures

real 0m0.753s
user 0m0.554s
sys 0m0.227s
[kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 1' -smp 2
timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 1 -smp 2
enabling apic
enabling apic
PASS: resumed from mwait 10000 times
SUMMARY: 1 tests

real 0m0.755s
user 0m0.562s
sys 0m0.221s


For comparison, here are the results including 'vmcall' on the
MacBookAir4,2 (interestingly, the result of the last test,
"-append '240 1 1' -smp 2", is different):

[kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 0'
timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 0
enabling apic
PASS: resumed from mwait 10000 times
SUMMARY: 1 tests

real 0m0.622s
user 0m0.501s
sys 0m0.130s
[kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 1'
timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 1
enabling apic
PASS: resumed from mwait 10000 times
SUMMARY: 1 tests

real 0m0.624s
user 0m0.504s
sys 0m0.127s
[kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 0' -smp 2
timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 0 -smp 2
enabling apic
enabling apic
FAIL: resumed from mwait 10023 times
SUMMARY: 1 tests, 1 unexpected failures

real 0m0.623s
user 0m0.544s
sys 0m0.110s
[kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 1' -smp 2
timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 1 -smp 2
enabling apic
enabling apic
FAIL: resumed from mwait 10006 times
SUMMARY: 1 tests, 1 unexpected failures

real 0m0.618s
user 0m0.527s
sys 0m0.121s
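
To put the FAIL lines in perspective: the three failing -smp 2 runs
report 10004, 10023 and 10006 resumes, i.e. 4, 23 and 6 more than the
10000 the test looks for. Below is a minimal sketch of a helper that
pulls that excess out of a result line. It is not part of
kvm-unit-tests; the name extra_resumes() is made up for illustration,
and defining TARGET_RESUMES as 10000 is an assumption based on the
PASS lines and on the "10000 wakeups" mentioned earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed target count; inferred from the "10000 times" PASS lines. */
#define TARGET_RESUMES 10000

/*
 * Hypothetical helper (not from kvm-unit-tests): parse a result line
 * such as "FAIL: resumed from mwait 10004 times" and store in *extra
 * how far the resume count is from TARGET_RESUMES.
 *
 * Returns 0 on success, -1 if the line is not a result line.
 */
int extra_resumes(const char *line, long *extra)
{
	char verdict[8];
	long resumes;

	assert(line != NULL && extra != NULL);

	/* verdict is "PASS" or "FAIL"; only the count is used here */
	if (sscanf(line, "%7[A-Z]: resumed from mwait %ld times",
		   verdict, &resumes) != 2)
		return -1;

	*extra = resumes - TARGET_RESUMES;
	return 0;
}
```

Feeding it each line of the serial output and ignoring the -1 returns
(e.g. for "enabling apic") gives the number of extra wakeups per run,
which is easier to compare across machines than the bare PASS/FAIL.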

HTH,
--Gabriel