Re: [RFC PATCH 2/2] Boost vCPUs based on IPI-sender and receiver information

From: Sean Christopherson
Date: Wed Apr 21 2021 - 12:16:43 EST


On Thu, Apr 22, 2021, Kenta Ishiguro wrote:
> This commit monitors IPI communication between vCPUs and leverages the
> relationship between vCPUs to select boost candidates.
>
> Cc: David Hildenbrand <david@xxxxxxxxxx>
> Signed-off-by: Kenta Ishiguro <kentaishiguro@xxxxxxxxxxxxxxxxxxxx>
> ---
> arch/x86/kvm/lapic.c | 14 ++++++++++++++
> arch/x86/kvm/vmx/vmx.c | 2 ++
> include/linux/kvm_host.h | 5 +++++
> virt/kvm/kvm_main.c | 26 ++++++++++++++++++++++++--
> 4 files changed, 45 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index cc369b9ad8f1..c8d967ddecf9 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1269,6 +1269,18 @@ void kvm_apic_set_eoi_accelerated(struct kvm_vcpu *vcpu, int vector)
> }
> EXPORT_SYMBOL_GPL(kvm_apic_set_eoi_accelerated);
>
> +static void mark_ipi_receiver(struct kvm_lapic *apic, struct kvm_lapic_irq *irq)
> +{
> + struct kvm_vcpu *dest_vcpu;
> + u64 prev_ipi_received;
> +
> + dest_vcpu = kvm_get_vcpu_by_id(apic->vcpu->kvm, irq->dest_id);
> + if (READ_ONCE(dest_vcpu->sched_outed)) {

dest_vcpu needs to be checked for NULL, kvm_get_vcpu_by_id() returns NULL if
no vCPU matches irq->dest_id.
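
A minimal userspace sketch of the NULL check, not the actual kernel change.
The types and get_vcpu_by_id() below are hypothetical stand-ins for
struct kvm_vcpu and kvm_get_vcpu_by_id(), and the READ_ONCE()/WRITE_ONCE()
accessors are omitted. The 64-bit constant in the shift is an extra assumption
on my part so that the mask matches the u64 field.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_VCPUS 4

/* Hypothetical stand-in for struct kvm_vcpu, only the fields used here. */
struct demo_vcpu {
	int vcpu_id;
	bool sched_outed;
	uint64_t ipi_received;
};

/* Hypothetical stand-in for struct kvm. */
struct demo_vm {
	struct demo_vcpu *vcpus[DEMO_MAX_VCPUS];
};

/* Mimics kvm_get_vcpu_by_id(): NULL when no vCPU has the requested ID. */
static struct demo_vcpu *get_vcpu_by_id(struct demo_vm *vm, int id)
{
	for (int i = 0; i < DEMO_MAX_VCPUS; i++) {
		if (vm->vcpus[i] && vm->vcpus[i]->vcpu_id == id)
			return vm->vcpus[i];
	}
	return NULL;
}

static void mark_ipi_receiver(struct demo_vm *vm, struct demo_vcpu *src,
			      int dest_id)
{
	struct demo_vcpu *dest = get_vcpu_by_id(vm, dest_id);

	/* dest_id may not name an existing vCPU, bail instead of dereferencing. */
	if (!dest)
		return;

	/* Record the sender only while the receiver is scheduled out. */
	if (dest->sched_outed)
		dest->ipi_received |= UINT64_C(1) << src->vcpu_id;
}
```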

> + prev_ipi_received = READ_ONCE(dest_vcpu->ipi_received);
> + WRITE_ONCE(dest_vcpu->ipi_received, prev_ipi_received | (1 << apic->vcpu->vcpu_id));
> + }
> +}
> +
> void kvm_apic_send_ipi(struct kvm_lapic *apic, u32 icr_low, u32 icr_high)
> {
> struct kvm_lapic_irq irq;
> @@ -1287,6 +1299,8 @@ void kvm_apic_send_ipi(struct kvm_lapic *apic, u32 icr_low, u32 icr_high)
>
> trace_kvm_apic_ipi(icr_low, irq.dest_id);
>
> + mark_ipi_receiver(apic, &irq);
> +
> kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
> }
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 29b40e092d13..ced50935a38b 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6718,6 +6718,8 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
> vmcs_write32(PLE_WINDOW, vmx->ple_window);
> }
>
> + WRITE_ONCE(vcpu->ipi_received, 0);

Given that ipi_received is cleared when the vCPU is run, is there actually an
observable advantage to tracking which vCPU sent the IPI? I.e. how do the
numbers look if ipi_received is a simple bool, and kvm_vcpu_on_spin() yields to
any vCPU that has an IPI pending?
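
To make the question concrete, here is a rough userspace sketch of the bool
variant. All names are hypothetical, and pick_boost_target() is only a
simplified stand-in for the candidate loop in kvm_vcpu_on_spin(), which has
additional eligibility checks that are not modeled here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-vCPU state, only the fields used here. */
struct spin_vcpu {
	bool scheduled_out;
	bool ipi_pending;	/* replaces the per-sender bitmask */
};

/* Sender side: remember that a scheduled-out vCPU has an IPI waiting. */
static void note_ipi(struct spin_vcpu *dest)
{
	if (dest && dest->scheduled_out)
		dest->ipi_pending = true;
}

/* Receiver side: running the vCPU clears the flag, as in the patch. */
static void vcpu_run(struct spin_vcpu *v)
{
	v->scheduled_out = false;
	v->ipi_pending = false;
}

/*
 * Spinner side: return the index of the first other vCPU that is scheduled
 * out with an IPI pending, regardless of who sent it, or -1 if none.
 */
static int pick_boost_target(const struct spin_vcpu *vcpus, size_t n, size_t me)
{
	for (size_t i = 0; i < n; i++) {
		if (i == me)
			continue;
		if (vcpus[i].scheduled_out && vcpus[i].ipi_pending)
			return (int)i;
	}
	return -1;
}
```

If the numbers with something like this match the bitmask version, the sender
tracking is not buying anything.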

> /*
> * We did this in prepare_switch_to_guest, because it needs to
> * be within srcu_read_lock.

...

> @@ -4873,6 +4894,7 @@ static void kvm_sched_out(struct preempt_notifier *pn,
> WRITE_ONCE(vcpu->preempted, true);
> WRITE_ONCE(vcpu->ready, true);
> }
> + WRITE_ONCE(vcpu->sched_outed, true);

s/sched_outed/scheduled_out to be more grammatically correct.

It might also make sense to introduce the flag in a separate patch. Or even
better, can the existing "preempted" and "ready" be massaged so that we don't
have three flags that are tracking the same basic thing, with slightly different
semantics?

> kvm_arch_vcpu_put(vcpu);
> __this_cpu_write(kvm_running_vcpu, NULL);
> }
> --
> 2.30.2
>