Re: [PATCH v2] KVM: x86: avoid redundant REQ_EVENT

From: Paolo Bonzini
Date: Sun Dec 18 2016 - 16:04:16 EST




On 15/12/2016 15:56, Roman Kagan wrote:
> On Thu, Dec 15, 2016 at 03:32:45PM +0100, Paolo Bonzini wrote:
>>
>>
>> On 15/12/2016 15:30, Radim KrÄmÃÅ wrote:
>>>
>>> One useless round of KVM_REQ_EVENT is not going change nested
>>> performance by much and it is not the only thing we could improve wrt.
>>> TPR ... I would just leave it for now and take care of it when we
>>> * don't to update PPR at all with APICv -- it is already correct
>>> * drop the KVM_REQ_EVENT with flex priority, because lower TPR cannot
>>> unmask an interrupt
>>
>> I agree. I still don't like the patch very much, because I feel like an
>> explicit state machine ("can KVM_REQ_EVENT do anything?") would be more
>> maintainable.
>
> We all seem to share that feeling towards this patch :) That's the
> reason why it was baking here internally for a long time: Denis
> discovered this scenario over a month ago while analyzing the
> performance regressions in KVM against our proprietary hypervisor, but
> pinning down a palatable and safe fix turned out to be a challenge.
>
> I think we did our best to stay safe; I agree that it ended up no so
> beautiful indeed.

Yes, and it's fine. Now the thing to do is to find something that
perhaps looks less safe, but actually shows the underlying invariants
better. It doesn't even need any kind of state machine, because the
state machine is already implicit in the vmexits.

In particular, KVM_REQ_EVENT is only needed in the following cases:

1) an interrupt is injected (we could optimize self-IPIs but we don't)

2) PPR is lowered so that old-PPR >= IRR and new-PPR < IRR; this can
only happen:

2.1) from a TPR write (CR8-write vmexit, TPR-below-threshold vmexit, TPR
write through MMIO or MSR)

2.2) from an EOI's clearing of a bit in ISR

3) the interrupt window opens


In particular, this list does not include:

- setting a bit in ISR, as in kvm_get_apic_interrupt

- clearing a bit in IRR, again as in kvm_get_apic_interrupt

- anything on EOI besides what apic_update_ppr already does

- changes to PPR that didn't cause a vmexit: if the TPR changes but it
doesn't trigger a vmexit---because of TPR_THRESHOLD on vmx or
update_cr8_intercept on svm---then the TPR change must not have unmasked
an interrupt, and KVM_REQ_EVENT is unnecessary.


So in the end, as in the APICv case but hopefully with less pitfalls,
the extra KVM_REQ_EVENT are not only slowing down the code, but the
apparent extra safety has the cost of obfuscating it. I'll see if I can
come up with something not too ugly.

Paolo