Re: The vcpu won't be wakened for a long time

From: Sean Christopherson
Date: Thu Dec 16 2021 - 10:42:42 EST


On Thu, Dec 16, 2021, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote:
> > What kernel version? There have been a variety of fixes/changes in the
> > area in recent kernels.
>
> The kernel version is 4.18, and it seems the latest kernel also has this problem.
>
> The following code fixes this bug; I've tested it on 4.18.
>
> (4.18)
>
> @@ -3944,6 +3944,11 @@ static void vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
>  	if (pi_test_and_set_on(&vmx->pi_desc))
>  		return;
>
> +	if (swq_has_sleeper(kvm_arch_vcpu_wq(vcpu))) {
> +		kvm_vcpu_kick(vcpu);
> +		return;
> +	}
> +
>  	if (vcpu != kvm_get_running_vcpu() &&
>  	    !kvm_vcpu_trigger_posted_interrupt(vcpu, false))
>  		kvm_vcpu_kick(vcpu);
>
>
> (latest)
>
> @@ -3959,6 +3959,11 @@ static int vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
>  	if (pi_test_and_set_on(&vmx->pi_desc))
>  		return 0;
>
> +	if (rcuwait_active(&vcpu->wait)) {
> +		kvm_vcpu_kick(vcpu);
> +		return 0;
> +	}
> +
>  	if (vcpu != kvm_get_running_vcpu() &&
>  	    !kvm_vcpu_trigger_posted_interrupt(vcpu, false))
>  		kvm_vcpu_kick(vcpu);
>
> Do you have any suggestions?

Hmm, that strongly suggests the "vcpu != kvm_get_running_vcpu()" check is at
fault. Can you try running with the below commit? It's currently sitting in
kvm/queue, but not marked for stable because I didn't think it was possible
for the check to cause a missed wake event in KVM's current code base.

commit 6a8110fea2c1b19711ac1ef718680dfd940363c6
Author: Sean Christopherson <seanjc@xxxxxxxxxx>
Date: Wed Dec 8 01:52:27 2021 +0000

KVM: VMX: Wake vCPU when delivering posted IRQ even if vCPU == this vCPU

Drop a check that guards triggering a posted interrupt on the currently
running vCPU, and more importantly guards waking the target vCPU if
triggering a posted interrupt fails because the vCPU isn't IN_GUEST_MODE.
The "do nothing" logic when "vcpu == running_vcpu" works only because KVM
doesn't have a path to ->deliver_posted_interrupt() from asynchronous
context. For example, if apic_timer_expired() were changed to always go
down the posted interrupt path for APICv, or if the IN_GUEST_MODE check in
kvm_use_posted_timer_interrupt() were dropped, and the hrtimer fired in
kvm_vcpu_block() after the final kvm_vcpu_check_block() check, the vCPU
would be scheduled out without being awakened, i.e. would "miss" the
timer interrupt.

One could argue that invoking kvm_apic_local_deliver() from (soft) IRQ
context for the current running vCPU should be illegal, but nothing in
KVM actually enforces that rule. There's also no strong obvious benefit
to making such behavior illegal, e.g. checking IN_GUEST_MODE and calling
kvm_vcpu_wake_up() is at worst marginally more costly than querying the
current running vCPU.

Lastly, this aligns the non-nested and nested usage of triggering posted
interrupts, and will allow for additional cleanups.

Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
Reviewed-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
Message-Id: <20211208015236.1616697-18-seanjc@xxxxxxxxxx>
Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 38749063da0e..f61a6348cffd 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3995,8 +3995,7 @@ static int vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
 	 * guaranteed to see PID.ON=1 and sync the PIR to IRR if triggering a
 	 * posted interrupt "fails" because vcpu->mode != IN_GUEST_MODE.
 	 */
-	if (vcpu != kvm_get_running_vcpu() &&
-	    !kvm_vcpu_trigger_posted_interrupt(vcpu, false))
+	if (!kvm_vcpu_trigger_posted_interrupt(vcpu, false))
 		kvm_vcpu_wake_up(vcpu);

 	return 0;
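
To make the suspected failure mode concrete, here is a rough userspace
sketch of the decision the patch changes. It is not kernel code: every
toy_* name is hypothetical, the struct only models the few bits of state
that matter here, and "woken" simply records whether a wake-up would have
been issued. The assumption is that delivery happens from asynchronous
context while the target vCPU is also the "running" vCPU but is not
IN_GUEST_MODE, i.e. it is on its way to blocking.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for vcpu->mode; hypothetical, not the KVM definitions. */
enum toy_mode { TOY_OUTSIDE_GUEST_MODE, TOY_IN_GUEST_MODE };

struct toy_vcpu {
	enum toy_mode mode;
	bool pid_on;	/* models PID.ON in the posted interrupt descriptor */
	bool woken;	/* set when the model would wake the vCPU */
};

/* Models the trigger helper: it only "succeeds" if the target is in guest mode. */
static bool toy_trigger_posted_interrupt(const struct toy_vcpu *vcpu)
{
	return vcpu->mode == TOY_IN_GUEST_MODE;
}

/*
 * Models the tail of the delivery path.
 * check_running == true  -> pre-patch logic (skip if vcpu is the running vCPU)
 * check_running == false -> post-patch logic (always try, wake on failure)
 */
static void toy_deliver(struct toy_vcpu *vcpu, const struct toy_vcpu *running,
			bool check_running)
{
	if (vcpu->pid_on)
		return;		/* a notification is already outstanding */
	vcpu->pid_on = true;

	if (check_running && vcpu == running)
		return;		/* pre-patch: "do nothing" for the running vCPU */

	if (!toy_trigger_posted_interrupt(vcpu))
		vcpu->woken = true;
}

/* Returns true if the wake-up is missed in the scenario described above. */
static bool toy_wakeup_missed(bool check_running)
{
	struct toy_vcpu v = { .mode = TOY_OUTSIDE_GUEST_MODE };

	/* Delivery targets the vCPU that is also the "running" one. */
	toy_deliver(&v, &v, check_running);
	assert(v.pid_on);
	return !v.woken;
}
```

Calling toy_wakeup_missed(true) models the old check and reports a missed
wake-up; toy_wakeup_missed(false) models the patched check and does not.
Whether the real 4.18 path actually reaches this state is exactly what
testing with the commit above should tell us.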