Re: [PATCH v2] KVM: LAPIC: Per vCPU control over kvm_can_post_timer_interrupt

From: Sean Christopherson
Date: Tue Dec 07 2021 - 18:23:12 EST


On Wed, Nov 24, 2021, Aili Yao wrote:
> When cpu-pm is successfully enabled, and hlt_in_guest is true and
> mwait_in_guest is false, the guest can't use the Monitor/Mwait instructions
> for idle operation; instead, the guest may use halt for that purpose. As
> we have enabled the cpu-pm feature and hlt_in_guest is true, we will also
> minimize guest exits. In such a scenario, Monitor/Mwait instruction
> support is totally disabled, so the guest has no way to use Mwait to exit
> from non-root mode.
>
> For cpu-pm feature, hlt_in_guest and others except mwait_in_guest will
> be a good hint for it. So replace it with hlt_in_guest.

This should be a separate patch from the housekeeping_cpu() check, if we add
the housekeeping check.

> Signed-off-by: Aili Yao <yaoaili@xxxxxxxxxxxx>
> ---
> arch/x86/kvm/lapic.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 759952dd1222..42aef1accd6b 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -34,6 +34,7 @@
> #include <asm/delay.h>
> #include <linux/atomic.h>
> #include <linux/jump_label.h>
> +#include <linux/sched/isolation.h>
> #include "kvm_cache_regs.h"
> #include "irq.h"
> #include "ioapic.h"
> @@ -113,13 +114,14 @@ static inline u32 kvm_x2apic_id(struct kvm_lapic *apic)
>
> static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
> {
> - return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
> + return pi_inject_timer && kvm_vcpu_apicv_active(vcpu) &&
> + !housekeeping_cpu(vcpu->cpu, HK_FLAG_TIMER);

Why not check kvm_{hlt,mwait}_in_guest()? IIUC, non-housekeeping CPUs don't _have_
to be associated 1:1 with a vCPU, in which case posting the timer is unlikely
to be a performance win even though the target isn't a housekeeping CPU.

And wouldn't exposing HLT/MWAIT to a vCPU that's on a housekeeping CPU be a bogus
configuration?

> }
>
> bool kvm_can_use_hv_timer(struct kvm_vcpu *vcpu)
> {
> return kvm_x86_ops.set_hv_timer
> - && !(kvm_mwait_in_guest(vcpu->kvm) ||
> + && !(kvm_hlt_in_guest(vcpu->kvm) ||

This is incorrect; the HLT vs. MWAIT distinction isn't purely about posting
interrupts. The VMX preemption timer counts down in C0, C1, and C2, but not in
deeper sleep states. HLT is always C1, thus it's safe to use the VMX preemption
timer even if the guest can execute HLT without exiting.

The timer isn't compatible with MWAIT because it stops counting in C3 (or deeper),
i.e. a guest that can execute MWAIT without exiting can cause the timer to stop
counting.
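
To make the C-state argument concrete, here is a minimal userspace sketch, not
KVM code. The enum, the helper names, and the use of C3 as a stand-in for "C3
or anything deeper" are assumptions for illustration; only the counting
behavior (C0/C1/C2 yes, deeper no) and "HLT is C1" come from the explanation
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative C-state model; these names are hypothetical, not KVM's. */
enum cstate { C0, C1, C2, C3 /* stands in for C3 and anything deeper */ };

/* Per the explanation above: the timer counts in C0, C1 and C2 only. */
static bool preemption_timer_counts(enum cstate s)
{
	return s <= C2;
}

/*
 * Deepest state the guest can enter without a VM-Exit.  Assumption: a
 * guest with MWAIT passthrough can request C3 or deeper, while a guest
 * with only HLT passthrough can reach C1 at most.
 */
static enum cstate deepest_guest_cstate(bool hlt_in_guest, bool mwait_in_guest)
{
	if (mwait_in_guest)
		return C3;
	if (hlt_in_guest)
		return C1;
	return C0;
}

/* Safe iff the timer still counts in the deepest reachable state. */
static bool preemption_timer_is_safe(bool hlt_in_guest, bool mwait_in_guest)
{
	return preemption_timer_counts(deepest_guest_cstate(hlt_in_guest,
							    mwait_in_guest));
}

/* Small self-check of the model. */
void cstate_model_selfcheck(void)
{
	assert(preemption_timer_is_safe(true, false));	/* HLT only: fine */
	assert(!preemption_timer_is_safe(true, true));	/* MWAIT: timer can stall */
}
```

In this model, HLT passthrough alone never makes the timer unsafe; only MWAIT
passthrough does, which is why swapping the MWAIT check for an HLT check
changes behavior in the wrong direction.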

> kvm_can_post_timer_interrupt(vcpu));
> }
> EXPORT_SYMBOL_GPL(kvm_can_use_hv_timer);
> --

Splicing in Wanpeng's version to try and merge the two threads:

On Tue, Nov 23, 2021 at 10:00 PM Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
> ---
> arch/x86/kvm/lapic.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 759952dd1222..8257566d44c7 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -113,14 +113,13 @@ static inline u32 kvm_x2apic_id(struct kvm_lapic *apic)
>
> static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
> {
> - return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
> + return pi_inject_timer && kvm_mwait_in_guest(vcpu->kvm) && kvm_vcpu_apicv_active(vcpu);

As Aili's changelog pointed out, MWAIT may not be advertised to the guest.

So I think we want this? With a non-functional, opinionated refactoring of
kvm_can_use_hv_timer() because I'm terrible at reading !(a || b).

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 40270d7bc597..c77cb386d03d 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -113,14 +113,25 @@ static inline u32 kvm_x2apic_id(struct kvm_lapic *apic)

static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
{
- return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
+ return pi_inject_timer && kvm_vcpu_apicv_active(vcpu) &&
+ (kvm_mwait_in_guest(vcpu->kvm) || kvm_hlt_in_guest(vcpu->kvm));
}

bool kvm_can_use_hv_timer(struct kvm_vcpu *vcpu)
{
- return kvm_x86_ops.set_hv_timer
- && !(kvm_mwait_in_guest(vcpu->kvm) ||
- kvm_can_post_timer_interrupt(vcpu));
+ /*
+ * Don't use the hypervisor timer, a.k.a. VMX Preemption Timer, if the
+ * guest can execute MWAIT without exiting as the timer will stop
+ * counting if the core enters C3 or deeper. HLT in the guest is ok as
+ * HLT is effectively C1 and the timer counts in C0, C1, and C2.
+ *
+ * Don't use the hypervisor timer if KVM can post a timer interrupt to
+ * the guest since posting the timer avoids taking an extra VM-Exit
+ * when the timer expires.
+ */
+ return kvm_x86_ops.set_hv_timer &&
+ !kvm_mwait_in_guest(vcpu->kvm) &&
+ !kvm_can_post_timer_interrupt(vcpu);
}
EXPORT_SYMBOL_GPL(kvm_can_use_hv_timer);
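
For anyone who wants to poke at the resulting truth table, the two predicates
from the proposed diff can be modeled as pure functions in userspace. This is
only a sketch: struct timer_cfg and its field names are hypothetical, each
field standing in for the corresponding kernel check noted in the comments.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flattened view of the state the two helpers consult. */
struct timer_cfg {
	bool pi_inject_timer;	/* the pi_inject_timer knob */
	bool apicv_active;	/* kvm_vcpu_apicv_active(vcpu) */
	bool hlt_in_guest;	/* kvm_hlt_in_guest(vcpu->kvm) */
	bool mwait_in_guest;	/* kvm_mwait_in_guest(vcpu->kvm) */
	bool has_set_hv_timer;	/* kvm_x86_ops.set_hv_timer is set */
};

/* Post the timer only if the guest can idle without exiting. */
bool can_post_timer_interrupt(const struct timer_cfg *c)
{
	return c->pi_inject_timer && c->apicv_active &&
	       (c->mwait_in_guest || c->hlt_in_guest);
}

/* Use the hv timer unless MWAIT can stall it or posting is preferred. */
bool can_use_hv_timer(const struct timer_cfg *c)
{
	return c->has_set_hv_timer &&
	       !c->mwait_in_guest &&
	       !can_post_timer_interrupt(c);
}

/* Two representative rows of the truth table. */
void timer_policy_selfcheck(void)
{
	struct timer_cfg hlt_only = {
		.pi_inject_timer = true, .apicv_active = true,
		.hlt_in_guest = true, .has_set_hv_timer = true,
	};
	struct timer_cfg no_passthrough = {
		.pi_inject_timer = true, .apicv_active = true,
		.has_set_hv_timer = true,
	};

	/* HLT passthrough: posting wins, hv timer is not used. */
	assert(can_post_timer_interrupt(&hlt_only));
	assert(!can_use_hv_timer(&hlt_only));

	/* No HLT/MWAIT passthrough: no posting, hv timer is used. */
	assert(!can_post_timer_interrupt(&no_passthrough));
	assert(can_use_hv_timer(&no_passthrough));
}
```

Note that in this model a guest with MWAIT passthrough but posting disabled
gets neither mechanism, since the !mwait_in_guest term rejects the hv timer
independently of the posting check.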