Re: [PATCH 3/3] KVM: VMX: Adjust number of LBR records for PERF_CAPABILITIES at refresh

From: Sean Christopherson
Date: Fri Jul 29 2022 - 13:28:25 EST


On Fri, Jul 29, 2022, Like Xu wrote:
> On 28/7/2022 7:34 am, Sean Christopherson wrote:
> > guest_cpuid_has() is expensive due to the linear search of guest CPUID
> > entries, intel_pmu_lbr_is_enabled() is checked on every VM-Enter, _and_
> > simply enumerating the same "Model" as the host causes KVM to set the
> > number of LBR records to a non-zero value.
>
> Before reconsidering vcpu->arch.perf_capabilities to reach a conclusion,
> how about this minor inline change to help reduce my sins?
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 0ecbbae42976..06a21d66be13 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7039,7 +7039,8 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
>  	pt_guest_enter(vmx);
>
>  	atomic_switch_perf_msrs(vmx);
> -	if (intel_pmu_lbr_is_enabled(vcpu))
> +	if (vmx->lbr_desc.records.nr &&
> +	    (vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT))

That doesn't do the right thing if X86_FEATURE_PDCM is cleared in guest CPUID.
It doesn't even require odd userspace behavior since intel_pmu_init() does:

	vcpu->arch.perf_capabilities = vmx_get_perf_capabilities();

E.g. older userspace that doesn't set MSR_IA32_PERF_CAPABILITIES will clear PDCM
without touching the vCPU's MSR value.
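
To make the failure mode concrete, here is a minimal standalone sketch, not
kernel code. Every toy_* name is hypothetical, the 0x3f mask and the 0x5
"host" value are assumptions picked for illustration, and the non-zero LBR
record count assumes the guest enumerates the same model as the host. It
models a vCPU whose MSR value was seeded from the host while userspace cleared
PDCM in CPUID:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed mask for the LBR format field; illustration only. */
#define TOY_PMU_CAP_LBR_FMT 0x3fu

/* Hypothetical stand-in for the few pieces of vCPU state that matter here. */
struct toy_vcpu {
	bool cpuid_has_pdcm;        /* PDCM as enumerated in guest CPUID */
	uint64_t perf_capabilities; /* the vCPU's PERF_CAPABILITIES MSR value */
	unsigned int lbr_nr;        /* number of LBR records set up for the guest */
};

/* Models init seeding the MSR value from the host, regardless of CPUID. */
static void toy_pmu_init(struct toy_vcpu *v, uint64_t host_caps)
{
	v->perf_capabilities = host_caps;
}

/* The check from the proposed diff: consults only the MSR value. */
static bool toy_lbr_enabled_proposed(const struct toy_vcpu *v)
{
	return v->lbr_nr && (v->perf_capabilities & TOY_PMU_CAP_LBR_FMT);
}

/* A check that also honors PDCM being cleared in guest CPUID. */
static bool toy_lbr_enabled_with_pdcm(const struct toy_vcpu *v)
{
	return v->cpuid_has_pdcm && toy_lbr_enabled_proposed(v);
}
```

With cpuid_has_pdcm = false, lbr_nr = 32, and toy_pmu_init(&v, 0x5), the
proposed check reports LBRs as enabled while the PDCM-aware check does not.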

In the unlikely scenario we can't figure out a solution for PERF_CAPABILITIES,
the alternative I tried first is to implement a generic CPUID feature "caching"
scheme and use it to expedite the PDCM lookup. I scrapped that approach when I
realized that KVM really should be able to consume PERF_CAPABILITIES during PMU
refresh.

I'm hesitant to even suggest a generic caching implementation because I suspect
most performance critical uses of guest CPUID will be similar to PDCM, i.e. can
be captured during KVM_SET_CPUID2 without requiring an explicit cache. And for
PERF_CAPABILITIES, IMO a robust implementation is a must-have, i.e. we've failed
if we can't handle it during PMU refresh.
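
As a rough illustration of "captured during KVM_SET_CPUID2 without requiring
an explicit cache", the standalone sketch below does the linear search once,
when the CPUID table is installed, and leaves the hot path with a plain field
read. This is not the KVM implementation; all toy_* names are hypothetical.
The only hardware fact relied on is that PDCM is enumerated in
CPUID.01H:ECX bit 15.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* PDCM is CPUID.01H:ECX[15]. */
#define TOY_CPUID_1_ECX_PDCM (1u << 15)

/* Hypothetical, simplified CPUID entry. */
struct toy_cpuid_entry {
	uint32_t function, index;
	uint32_t eax, ebx, ecx, edx;
};

struct toy_vcpu {
	const struct toy_cpuid_entry *entries;
	size_t nent;
	bool guest_has_pdcm; /* captured once when CPUID is set */
};

/* The expensive part: a linear search of the guest CPUID entries. */
static const struct toy_cpuid_entry *
toy_find_entry(const struct toy_vcpu *v, uint32_t function, uint32_t index)
{
	for (size_t i = 0; i < v->nent; i++) {
		if (v->entries[i].function == function &&
		    v->entries[i].index == index)
			return &v->entries[i];
	}
	return NULL;
}

/* Models the set-CPUID path: search once, remember the answer. */
static void toy_set_cpuid(struct toy_vcpu *v,
			  const struct toy_cpuid_entry *entries, size_t nent)
{
	const struct toy_cpuid_entry *leaf1;

	v->entries = entries;
	v->nent = nent;

	leaf1 = toy_find_entry(v, 1, 0);
	v->guest_has_pdcm = leaf1 && (leaf1->ecx & TOY_CPUID_1_ECX_PDCM);
}

/* Models the per-VM-Enter check: no search, just a field read. */
static bool toy_hot_path_has_pdcm(const struct toy_vcpu *v)
{
	return v->guest_has_pdcm;
}
```

The captured value has to be recomputed every time the CPUID table is
replaced, which is why toy_set_cpuid() overwrites it unconditionally.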