Re: [PATCH v3 6/8] KVM: x86/svm/pmu: Add AMD PerfMonV2 support

From: Like Xu
Date: Mon Feb 06 2023 - 02:54:00 EST


On 25/1/2023 8:10 am, Sean Christopherson wrote:
On Fri, Nov 11, 2022, Like Xu wrote:
@@ -162,20 +179,42 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+ struct kvm_cpuid_entry2 *entry;
+ union cpuid_0x80000022_ebx ebx;

- if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
- pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
+ pmu->version = 1;
+ if (kvm_cpu_cap_has(X86_FEATURE_AMD_PMU_V2) &&

Why check kvm_cpu_cap support? I.e. what will go wrong if userspace enumerates
PMU v2 to the guest without proper hardware/KVM support?

If this is _necessary_ to protect the host kernel, then we should probably have
a helper to query PMU features, e.g.

static __always_inline bool guest_pmu_has(struct kvm_vcpu *vcpu,
unsigned int x86_feature)
{
return kvm_cpu_cap_has(x86_feature) &&
guest_cpuid_has(vcpu, x86_feature);
}



+ guest_cpuid_has(vcpu, X86_FEATURE_AMD_PMU_V2)) {
+ pmu->version = 2;
+ entry = kvm_find_cpuid_entry_index(vcpu, 0x80000022, 0);
+ ebx.full = entry->ebx;
+ pmu->nr_arch_gp_counters = min3((unsigned int)ebx.split.num_core_pmc,
+ (unsigned int)kvm_pmu_cap.num_counters_gp,
+ (unsigned int)KVM_AMD_PMC_MAX_GENERIC);

Blech. This really shouldn't be necessary, KVM should tweak kvm_pmu_cap.num_counters_gp
as needed during initialization to ensure num_counters_gp doesn't exceed KVM's
internal limits.

Posted a patch[*], please take a look. As mentioned in that thread, I'll somewhat
speculatively apply that series sooner rather than later so that you can use it as a
base for this series (assuming the patch isn't busted).

[*] https://lore.kernel.org/all/20230124234905.3774678-2-seanjc@xxxxxxxxxx
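For illustration only, the idea of clamping once at initialization could be sketched as the standalone C below. This is not the posted patch: struct pmu_cap, pmu_cap_init(), guest_counters() and the value of PMC_MAX_GENERIC are hypothetical stand-ins chosen for the example.

```c
#include <assert.h>

/* Stand-in for KVM_AMD_PMC_MAX_GENERIC; the value here is assumed. */
#define PMC_MAX_GENERIC 6u

/* Hypothetical, reduced model of the host PMU capabilities. */
struct pmu_cap {
	unsigned int num_counters_gp;
};

/* Run once at setup: never advertise more than the internal limit. */
void pmu_cap_init(struct pmu_cap *cap, unsigned int hw_counters)
{
	cap->num_counters_gp = hw_counters < PMC_MAX_GENERIC ?
			       hw_counters : PMC_MAX_GENERIC;
}

/* Per-vCPU refresh: a two-way min suffices, no third operand needed. */
unsigned int guest_counters(const struct pmu_cap *cap, unsigned int cpuid_num)
{
	/* Invariant established by pmu_cap_init(). */
	assert(cap->num_counters_gp <= PMC_MAX_GENERIC);

	return cpuid_num < cap->num_counters_gp ?
	       cpuid_num : cap->num_counters_gp;
}
```

With the limit enforced in one place, the refresh path only has to compare the guest's CPUID value against the already-clamped capability.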

+ }
+
+ /* Commitment to minimal PMCs, regardless of CPUID.80000022 */

Please expand this comment. I'm still not entirely sure I've interpreted it correctly,
and I'm not sure that I agree with the code.

In the first version [1], I used almost the same if-elif-else sequence,
but the concerns from JimM [2] have changed my mind:

"Nonetheless, for compatibility with old software, Fn8000_0022_EBX can never
report less than four counters (or six, if Fn8000_0001_ECX[PerfCtrExtCore] is set)."

Both in amd_pmu_refresh() and in __do_cpuid_func(), KVM implements
this with an override approach: first apply the semantics of
AMD_PMU_V2, then enforce a minimum number of counters
based on whether or not the guest has PERFCTR_CORE.
The proposed if-elif-else does not fulfill this need.

[1] 20220905123946.95223-4-likexu@xxxxxxxxxxx/
[2] CALMp9eQObuiJGV=YrAU9Fw+KoXfJtZMJ-KUs-qCOVd+R9zGBpw@xxxxxxxxxxxxxx
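To make the difference concrete, here is a rough standalone sketch of the two approaches side by side. It is not the KVM code: counters_override() and counters_if_else() are names made up for this example, and the two constants only mirror AMD64_NUM_COUNTERS and AMD64_NUM_COUNTERS_CORE.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirror AMD64_NUM_COUNTERS (4) and AMD64_NUM_COUNTERS_CORE (6). */
#define NUM_COUNTERS		4u
#define NUM_COUNTERS_CORE	6u

static unsigned int min_u(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/* Override approach: apply the v2 count first, then enforce the legacy floor. */
unsigned int counters_override(bool v2, bool perfctr_core,
			       unsigned int cpuid_num, unsigned int host_num)
{
	unsigned int floor = perfctr_core ? NUM_COUNTERS_CORE : NUM_COUNTERS;
	unsigned int n = 0;

	if (v2)
		n = min_u(cpuid_num, host_num);

	n = max_u(n, floor);
	assert(n >= NUM_COUNTERS);	/* never fewer than four */
	return n;
}

/* if-elif-else as proposed in the review: no floor once v2 is enumerated. */
unsigned int counters_if_else(bool v2, bool perfctr_core,
			      unsigned int cpuid_num, unsigned int host_num)
{
	if (v2)
		return min_u(cpuid_num, host_num);
	if (perfctr_core)
		return NUM_COUNTERS_CORE;
	return NUM_COUNTERS;
}
```

For a guest with both PMU v2 and PERFCTR_CORE whose CPUID reports two counters, the override version yields six while the if-elif-else version yields two, which is exactly the case the quoted text says cannot occur.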


+ if (kvm_cpu_cap_has(X86_FEATURE_PERFCTR_CORE) &&

AFAICT, checking kvm_cpu_cap_has() is an unrelated change. Either it's a bug fix
and belongs in a separate patch, or it's unnecessary and should be dropped.

+ guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
+ pmu->nr_arch_gp_counters = max_t(unsigned int,
+ pmu->nr_arch_gp_counters,
+ AMD64_NUM_COUNTERS_CORE);
else
- pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS;
+ pmu->nr_arch_gp_counters = max_t(unsigned int,
+ pmu->nr_arch_gp_counters,
+ AMD64_NUM_COUNTERS);

Using max() doesn't look right, e.g. if KVM ends up running on some odd setup
where ebx.split.num_core_pmc/kvm_pmu_cap.num_counters_gp is less than
AMD64_NUM_COUNTERS_CORE or AMD64_NUM_COUNTERS.

Or more likely, if userspace says "only expose N counters to this guest".

Shouldn't this be something like?

if (guest_cpuid_has(vcpu, X86_FEATURE_AMD_PMU_V2))
pmu->nr_arch_gp_counters = min(ebx.split.num_core_pmc,
kvm_pmu_cap.num_counters_gp);
else if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
else
pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS;