Re: [PATCH V4] KVM: x86: Sync the pending Posted-Interrupts

From: Paolo Bonzini
Date: Thu Feb 14 2019 - 05:54:57 EST


On 14/02/19 03:48, Luwei Kang wrote:
> Some Posted-Interrupts from passthrough devices may be lost or
> overwritten when the vCPU is in the runnable state.
>
> The SN (Suppress Notification) bit of the PID (Posted Interrupt
> Descriptor) is set when the vCPU is preempted (vCPU in
> KVM_MP_STATE_RUNNABLE state but not running on a physical CPU). If a
> posted interrupt arrives at this time, the IRQ remapping facility sets
> the corresponding bit in the PIR (Posted Interrupt Requests) without
> setting ON (Outstanding Notification). As a result, the interrupt
> cannot be synced to the APIC virtualization registers and will not be
> handled by the guest, because ON is zero.
>
> Signed-off-by: Luwei Kang <luwei.kang@xxxxxxxxx>

Queued, thanks.

Paolo

> ---
> arch/x86/kvm/vmx/vmx.c | 26 +++++++++++---------------
> arch/x86/kvm/vmx/vmx.h | 6 ++++++
> arch/x86/kvm/x86.c | 2 +-
> 3 files changed, 18 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index f6915f1..fe59199 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1192,21 +1192,6 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
>  	if (!pi_test_sn(pi_desc) && vcpu->cpu == cpu)
>  		return;
>
> -	/*
> -	 * First handle the simple case where no cmpxchg is necessary; just
> -	 * allow posting non-urgent interrupts.
> -	 *
> -	 * If the 'nv' field is POSTED_INTR_WAKEUP_VECTOR, do not change
> -	 * PI.NDST: pi_post_block will do it for us and the wakeup_handler
> -	 * expects the VCPU to be on the blocked_vcpu_list that matches
> -	 * PI.NDST.
> -	 */
> -	if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR ||
> -	    vcpu->cpu == cpu) {
> -		pi_clear_sn(pi_desc);
> -		return;
> -	}
> -
>  	/* The full case. */
>  	do {
>  		old.control = new.control = pi_desc->control;
> @@ -1221,6 +1206,17 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
>  		new.sn = 0;
>  	} while (cmpxchg64(&pi_desc->control, old.control,
>  			   new.control) != old.control);
> +
> +	/*
> +	 * Clear SN before reading the bitmap. The VT-d firmware
> +	 * writes the bitmap and reads SN atomically (5.2.3 in the
> +	 * spec), so it doesn't really have a memory barrier that
> +	 * pairs with this, but we cannot do that and we need one.
> +	 */
> +	smp_mb__after_atomic();
> +
> +	if (!bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS))
> +		pi_set_on(pi_desc);
> }
>
> /*
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 9932895..a4527e1 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -349,6 +349,12 @@ static inline void pi_set_sn(struct pi_desc *pi_desc)
>  		(unsigned long *)&pi_desc->control);
> }
>
> +static inline void pi_set_on(struct pi_desc *pi_desc)
> +{
> +	set_bit(POSTED_INTR_ON,
> +		(unsigned long *)&pi_desc->control);
> +}
> +
> static inline void pi_clear_on(struct pi_desc *pi_desc)
> {
>  	clear_bit(POSTED_INTR_ON,
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 3d32b8f..ebd6737 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7795,7 +7795,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> * 1) We should set ->mode before checking ->requests. Please see
> * the comment in kvm_vcpu_exiting_guest_mode().
> *
> - * 2) For APICv, we should set ->mode before checking PIR.ON. This
> + * 2) For APICv, we should set ->mode before checking PID.ON. This
> * pairs with the memory barrier implicit in pi_test_and_set_on
> * (see vmx_deliver_posted_interrupt).
> *
>
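
For readers following the thread, the race described in the commit
message can be sketched as a small user-space model. This is only an
illustration and not kernel code: the names sim_pi_desc,
sim_vcpu_preempt, sim_post_interrupt, sim_vcpu_load and
sim_interrupt_visible are hypothetical, the descriptor layout is
simplified (only PIR, ON and SN are modelled), and C11 sequentially
consistent atomics stand in for the cmpxchg64 loop and the
smp_mb__after_atomic() barrier in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VECTORS 256
#define PI_ON (UINT64_C(1) << 0)   /* Outstanding Notification */
#define PI_SN (UINT64_C(1) << 1)   /* Suppress Notification */

/* Simplified, hypothetical stand-in for the posted-interrupt descriptor. */
struct sim_pi_desc {
    _Atomic uint64_t pir[NR_VECTORS / 64];  /* one bit per vector */
    _Atomic uint64_t control;               /* only ON and SN are modelled */
};

/* vCPU is preempted: suppress notifications. */
static void sim_vcpu_preempt(struct sim_pi_desc *pi)
{
    atomic_fetch_or(&pi->control, PI_SN);
}

/*
 * Model of the remapping hardware posting an interrupt: the PIR bit is
 * always set, but ON is set only when SN is clear.
 */
static void sim_post_interrupt(struct sim_pi_desc *pi, unsigned int vector)
{
    assert(vector < NR_VECTORS);
    atomic_fetch_or(&pi->pir[vector / 64], UINT64_C(1) << (vector % 64));
    if (!(atomic_load(&pi->control) & PI_SN))
        atomic_fetch_or(&pi->control, PI_ON);
}

static bool sim_pir_empty(struct sim_pi_desc *pi)
{
    for (int i = 0; i < NR_VECTORS / 64; i++)
        if (atomic_load(&pi->pir[i]) != 0)
            return false;
    return true;
}

/*
 * vCPU is scheduled back in. With resync_on == false this models the old
 * behaviour (only SN is cleared); with resync_on == true it models the
 * patched behaviour (clear SN, then set ON if any PIR bit is pending).
 * The sequentially consistent atomics order the SN clear before the PIR
 * read, which is the role of the explicit barrier in the patch.
 */
static void sim_vcpu_load(struct sim_pi_desc *pi, bool resync_on)
{
    atomic_fetch_and(&pi->control, ~PI_SN);
    if (resync_on && !sim_pir_empty(pi))
        atomic_fetch_or(&pi->control, PI_ON);
}

/* In this model the pending PIR bits are only noticed when ON is set. */
static bool sim_interrupt_visible(struct sim_pi_desc *pi)
{
    return (atomic_load(&pi->control) & PI_ON) != 0;
}
```

Calling sim_vcpu_preempt(), then sim_post_interrupt(), then
sim_vcpu_load() with resync_on == false leaves the PIR bit set while ON
stays clear, which is the lost-interrupt case; the same sequence with
resync_on == true sets ON on load.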