Re: [PATCH v2] KVM: x86: Sync the pending Posted-Interrupts

From: Paolo Bonzini
Date: Wed Jan 30 2019 - 06:18:12 EST


On 30/01/19 11:38, Kang, Luwei wrote:
>>>> This is not what I asked. You should instead do the check after pi_clear_sn.
>>>>
>>>
>>> I think the SN has been cleared here before test the bitmap.
>>> The SN will be set when the vCPU is schedule out. ID:
>>> 28b835d60fcc2498e717cf5e6f0c3691c24546f7
>>> But SN will be cleared when sched in.
>>>
>>> Another place is when vCPU run out of the vcpu_run() function:
>>> kvm_arch_vcpu_ioctl_run()
>>> vcpu_load(vcpu); -> kvm_arch_vcpu_load -> vmx_vcpu_load -> vmx_vcpu_pi_load -> new.sn = 0;
>>> vcpu_run(vcpu);
>>> for(;;)
>>> vcpu_put(vcpu); -> kvm_arch_vcpu_put -> vmx_vcpu_put ->
>>> vmx_vcpu_pi_put -> pi_set_sn() But SN will be cleared in vcpu_load()
>>> before back to vcpu_run()
>>
>> Yes, but you're changing the wrong path. The patch is affecting _all_ vmentries, not just those after PID.SN has been cleared.
>>
>> As I mentioned in the previous email, KVM relies on the SDM's invariant that ON where PID.ON=1 whenever PID.PIR!=0. Invariants are your
>> best friend when dealing with complicated multi-processor code so I don't want to change that.
>>
>> It's the VT-d pi_clear_sn path that I want to be changed, because it's VT-d and specifically SN that complicates the very simple definition in
>> the SDM. By modifying the pi_clear_sn path, you ensure the invariant is respected and everyone is happy.
>
> Hi Paolo,
> How about like this:
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 820a03b..dfc5e3d 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1219,6 +1219,9 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
> new.ndst = (dest << 8) & 0xFF00;
>
> new.sn = 0;
> +
> + if (!bitmap_empty((unsigned long *)new.pir, NR_VECTORS))
> + new.on = 1;

This is racy, the bitmap can change after the cmpxchg64; and the "if"
that calls pi_clear_sn needs to check the bitmap as well.

So you have to remove that "if", and move the check outside the
do/while. You also need an smp_mb__after_atomic() after the do/while.

Paolo

> } while (cmpxchg64(&pi_desc->control, old.control,
> new.control) != old.control);
> }