[PATCH] KVM: VMX: Fix NMI event loss

From: Tianyi Liu
Date: Mon Aug 28 2023 - 05:09:40 EST


Hi, Sean:

I found that some PMU events are being lost on the latest kernel.
Bisecting points to commit [1], which moved the handling of NMI events
from `handle_exception_irqoff` to `vmx_vcpu_enter_exit`.

If I revert that part, as done in this patch, it works correctly. However,
I'm not really familiar with KVM, and I'm not sure about the intent behind
the original patch [1]. Could you please take a look at this? Thanks a lot.

My use case is sampling the guest OS instruction pointer with `perf kvm`:
`perf kvm --guest record -a -g -e instructions -F 10000 -- sleep 1`

When it works correctly, it records about 10000 samples (per `-F 10000`)
and prints:
`[ perf record: Captured and wrote 0.9 MB perf.data.guest (9729 samples) ]`
When it does not, it records only ~100 samples, and sometimes none at all.

In case it helps with debugging, the call chain of `vmx_vcpu_enter_exit` is:
vmx_vcpu_enter_exit
vmx_vcpu_run
kvm_x86_vcpu_run
vcpu_enter_guest

The call chain of `handle_exception_irqoff` is:
handle_exception_irqoff
vmx_handle_exit_irqoff
kvm_x86_handle_exit_irqoff
vcpu_enter_guest

[1] https://lore.kernel.org/all/20221213060912.654668-8-seanjc@xxxxxxxxxx/

Signed-off-by: Tianyi Liu <i.pear@xxxxxxxxxxx>
---
arch/x86/kvm/vmx/vmx.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index df461f387e20..3a0b13867a6b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6955,6 +6955,12 @@ static void handle_exception_irqoff(struct vcpu_vmx *vmx)
 	/* Handle machine checks before interrupts are enabled */
 	else if (is_machine_check(intr_info))
 		kvm_machine_check();
+	/* We need to handle NMIs before interrupts are enabled */
+	else if (is_nmi(intr_info)) {
+		kvm_before_interrupt(&vmx->vcpu, KVM_HANDLING_NMI);
+		vmx_do_nmi_irqoff();
+		kvm_after_interrupt(&vmx->vcpu);
+	}
 }

static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)
@@ -7251,13 +7257,6 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 	else
 		vmx->exit_reason.full = vmcs_read32(VM_EXIT_REASON);
 
-	if ((u16)vmx->exit_reason.basic == EXIT_REASON_EXCEPTION_NMI &&
-	    is_nmi(vmx_get_intr_info(vcpu))) {
-		kvm_before_interrupt(vcpu, KVM_HANDLING_NMI);
-		vmx_do_nmi_irqoff();
-		kvm_after_interrupt(vcpu);
-	}
-
 	guest_state_exit_irqoff();
 }

--
2.41.0