Re: [RESEND PATCH 5/6] KVM: x86/VMX: add kvm_vmx_reinject_nmi_irq() for NMI/IRQ reinjection

From: H. Peter Anvin
Date: Fri Nov 11 2022 - 17:23:11 EST


On November 11, 2022 8:35:30 AM PST, Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx> wrote:
>On 11/11/2022 14:23, Peter Zijlstra wrote:
>> On Fri, Nov 11, 2022 at 01:48:26PM +0100, Paolo Bonzini wrote:
>>> On 11/11/22 13:19, Peter Zijlstra wrote:
>>>> On Fri, Nov 11, 2022 at 01:04:27PM +0100, Paolo Bonzini wrote:
>>>>> On Intel you can optionally make it hold onto IRQs, but NMIs are always
>>>>> eaten by the VMEXIT and have to be reinjected manually.
>>>> That 'optionally' thing worries me -- as in, KVM is currently
>>>> opting-out?
>>> Yes, because "If the “process posted interrupts” VM-execution control is 1,
>>> the “acknowledge interrupt on exit” VM-exit control is 1" (SDM 26.2.1.1,
>>> checks on VM-Execution Control Fields). Ipse dixit. Posted interrupts are
>>> available and used on all processors since I think Ivy Bridge.
>
>On server SKUs.  Client only got "virtual interrupt processing" fairly
>recently IIRC, which is the CPU-side property which matters.
>
>> (imagine the non-coc compliant reaction here)
>>
>> So instead of fixing it, they made it worse :-(
>>
>> And now FRED is arguably making it worse again, and people wonder why I
>> hate virt...
>
>The only FRED-compatible fix is to send a self-NMI, because you
>may need a CSL change too.
>
>VT-x *does* hold the NMI latch (for VMEXIT_REASON NMI), so it's self-NMI
>and then enable_nmi()s.
>
>Except the IRET to self won't work - it will need to be ERETS-to-self. 
>Which I think is fine.
>
>But what isn't fine is the fact that a self-NMI doesn't deliver
>synchronously, so you need to wait until it is pending, before enabling
>NMIs.  (Well, actually you need to ensure that it's definitely delivered
>before re-entering the VM).
>
>And I'm totally out of ideas here...
>
>~Andrew
>

There is no fundamental reason to do a CSL/IST change if you happen to know a priori that the stack is in a valid state to have the NMI frame on it; that is:

1. Not deep into a nested I/O layer;
2. Valid, and not in flux in any way.

Since this reinject will always be in a well-defined location, that's fine.

So I think *that* concern is not actually an issue.
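
To make the two conditions above concrete, here is a rough sketch in plain C of the check I have in mind. This is not kernel code: struct reinject_site, MAX_SAFE_NESTING and needs_stack_level_change() are hypothetical names made up for illustration, and the depth threshold is an arbitrary assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of the stack at the reinjection site. */
struct reinject_site {
	unsigned int nesting_depth;	/* event frames already on this stack */
	bool stack_valid;		/* stack pointer refers to a usable stack */
	bool stack_in_flux;		/* e.g. in the middle of a stack switch */
};

/* Assumed limit for "not deep"; the real bound depends on stack size. */
#define MAX_SAFE_NESTING 1u

/*
 * Returns true if the NMI frame cannot safely go on the current stack,
 * i.e. a stack level (CSL/IST) change would be required.
 */
static bool needs_stack_level_change(const struct reinject_site *s)
{
	assert(s != NULL);

	if (s->nesting_depth > MAX_SAFE_NESTING)
		return true;	/* condition 1 violated: too deeply nested */
	if (!s->stack_valid || s->stack_in_flux)
		return true;	/* condition 2 violated: invalid or in flux */
	return false;
}
```

For a reinjection point at a fixed, well-defined location, both conditions are known in advance, so the check would always come out false there.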

Again, note that this is not a FRED-specific problem.