Re: [PATCH v2] KVM: X86: Ultra fast single target IPI fastpath

From: Sean Christopherson
Date: Fri Apr 10 2020 - 13:47:08 EST


On Fri, Apr 10, 2020 at 05:50:35PM +0200, Paolo Bonzini wrote:
> On 10/04/20 17:35, Sean Christopherson wrote:
> > IMO, this should come at the very end of vmx_vcpu_run(). At a minimum, it
> > needs to be moved below the #MC handling and below
> >
> > if (vmx->fail || (vmx->exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY))
> > return;
>
> Why? It cannot run in any of those cases, since the vmx->exit_reason
> won't match.

#MC and consistency checks should have "priority" over everything else.
That there isn't actually a conflict is irrelevant IMO. And it's something
that will likely confuse newbies (to VMX and/or KVM) as it won't be obvious
that the motivation was to shave a few cycles, as opposed to some corner
case where the fastpath handling does something meaningful even on failure.

> > KVM more or less assumes vmx->idt_vectoring_info is always valid, and it's
> > not obvious that a generic fastpath call can safely run before
> > vmx_complete_interrupts(), e.g. the kvm_clear_interrupt_queue() call.
>
> Not KVM, rather vmx.c. You're right about a generic fastpath, but in
> this case kvm_irq_delivery_to_apic_fast is not touching VMX state; even
> if you have a self-IPI, the modification of vCPU state is only scheduled
> here and will happen later via either kvm_x86_ops.sync_pir_to_irr or
> KVM_REQ_EVENT.

I think what I don't like is that the fast-IPI code is buried in a helper
that masquerades as a generic fastpath handler. If that's open-coded in
vmx_vcpu_run(), I'm ok with doing the fast-IPI handling immediately after
the failure checks.
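
E.g. something along these lines (completely untested sketch; the
handle_fastpath_x2apic_icr_write() name is made up for illustration and
stands in for the ICR-write guts of handle_fastpath_set_msr_irqoff()):

	/*
	 * Hypothetical open-coded check: make it obvious that the only
	 * thing being handled here is a WRMSR to the x2APIC ICR, i.e. a
	 * single-target IPI, not a generic fastpath dispatch.
	 */
	if (!is_guest_mode(vcpu) &&
	    vmx->exit_reason == EXIT_REASON_MSR_WRITE &&
	    kvm_rcx_read(vcpu) == APIC_BASE_MSR + (APIC_ICR >> 4))
		exit_fastpath = handle_fastpath_x2apic_icr_write(vcpu);
	else
		exit_fastpath = EXIT_FASTPATH_NONE;

That way the "fastpath" naming in vmx.c doesn't imply there's a
general-purpose dispatcher hiding behind it.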

And fast-IPI aside, the code could use a bit of optimization to prioritize
successful VM-Enter, which would slot in nicely as a prep patch. Patches
(should be) following.

IMO, this is more logically correct:

	vmx->exit_reason = vmcs_read32(VM_EXIT_REASON);
	if (unlikely((u16)vmx->exit_reason == EXIT_REASON_MCE_DURING_VMENTRY))
		kvm_machine_check();

	if (unlikely(vmx->exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY))
		return EXIT_FASTPATH_NONE;

	if (!is_guest_mode(vcpu) && vmx->exit_reason == EXIT_REASON_MSR_WRITE)
		exit_fastpath = handle_fastpath_set_msr_irqoff(vcpu);
	else
		exit_fastpath = EXIT_FASTPATH_NONE;

And on my system, the compiler hoists the fast-IPI check above the #MC
check, i.e. moving the fast-IPI handling down in the C code only adds a
single macrofused uop, the testb+jne for FAILED_VMENTRY, to that path.

0xffffffff81067d1d <+701>: vmread %rax,%rax
0xffffffff81067d20 <+704>: ja,pt 0xffffffff81067d2d <vmx_vcpu_run+717>
0xffffffff81067d23 <+707>: pushq $0x0
0xffffffff81067d25 <+709>: push %rax
0xffffffff81067d26 <+710>: callq 0xffffffff81071790 <vmread_error_trampoline>
0xffffffff81067d2b <+715>: pop %rax
0xffffffff81067d2c <+716>: pop %rax
0xffffffff81067d2d <+717>: test %eax,%eax
0xffffffff81067d2f <+719>: mov %eax,0x32b0(%rbp)
0xffffffff81067d35 <+725>: js 0xffffffff81067d5a <vmx_vcpu_run+762>
0xffffffff81067d37 <+727>: testb $0x20,0x2dc(%rbp)
0xffffffff81067d3e <+734>: jne 0xffffffff81067d49 <vmx_vcpu_run+745>
0xffffffff81067d40 <+736>: cmp $0x20,%eax
0xffffffff81067d43 <+739>: je 0xffffffff810686d4 <vmx_vcpu_run+3188> <-- fastpath handler
0xffffffff81067d49 <+745>: xor %ebx,%ebx
0xffffffff81067d4b <+747>: jmpq 0xffffffff81067e65 <vmx_vcpu_run+1029>