Re: [PATCH] KVM: x86/svm/pmu: Set PerfMonV2 global control bits correctly

From: Mi, Dapeng
Date: Mon Mar 04 2024 - 02:59:27 EST



On 3/1/2024 5:00 PM, Sandipan Das wrote:
On 3/1/2024 2:07 PM, Like Xu wrote:
On 1/3/2024 3:50 pm, Sandipan Das wrote:
With PerfMonV2, a performance monitoring counter will start operating
only when both the PERF_CTLx enable bit as well as the corresponding
PerfCntrGlobalCtl enable bit are set.

When the PerfMonV2 CPUID feature bit (leaf 0x80000022 EAX bit 0) is set
for a guest but the guest kernel does not support PerfMonV2 (such as
kernels older than v5.19), the guest counters do not count since the
PerfCntrGlobalCtl MSR is initialized to zero and the guest kernel never
writes to it.
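
The enable condition described above can be sketched in plain C. This is
only an illustration, not KVM code: counter_is_counting() is a hypothetical
helper, and PERF_CTL_EN assumes the PERF_CTLx enable bit is bit 22, matching
ARCH_PERFMON_EVENTSEL_ENABLE in Linux.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed: EN bit of a PERF_CTLx MSR (bit 22). */
#define PERF_CTL_EN (1ULL << 22)

/*
 * Hypothetical helper: with PerfMonV2, counter idx counts only when
 * both its PERF_CTLx enable bit and bit idx of PerfCntrGlobalCtl are set.
 */
static bool counter_is_counting(uint64_t perf_ctl, uint64_t global_ctl,
				unsigned int idx)
{
	assert(idx < 64);	/* keep the shift well defined */
	return (perf_ctl & PERF_CTL_EN) && (global_ctl & (1ULL << idx));
}
```

With global_ctl left at zero, as in the guest described above, the helper
returns false even when the legacy enable bit is set, which is why an older
guest kernel sees no counts.
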
If the vCPU has the PerfMonV2 feature, it should not work the way the legacy
PMU does. Users need to use the new driver to operate the new hardware,
don't they? One practical approach is for the hypervisor not to set
the PerfMonV2 bit for this unpatched 'v5.19' guest.

My understanding is that the legacy method of managing the counters should
still work because the enable bits in PerfCntrGlobalCtl are expected to be
set. The AMD PPR does mention that the PerfCntrEn bitfield of PerfCntrGlobalCtl
is set to 0x3f after a system reset. That way, the guest kernel can use either
the new or legacy method.

If so, please add the PPR description here as comments.

This is not observed on bare-metal as the default value of the
PerfCntrGlobalCtl MSR after a reset is 0x3f (assuming there are six
counters) and the counters can still be operated by using the enable
bit in the PERF_CTLx MSRs. Replicate the same behaviour in guests for
compatibility with older kernels.

Before:

   $ perf stat -e cycles:u true

    Performance counter stats for 'true':

                    0      cycles:u

          0.001074773 seconds time elapsed

          0.001169000 seconds user
          0.000000000 seconds sys

After:

   $ perf stat -e cycles:u true

    Performance counter stats for 'true':

              227,850      cycles:u

          0.037770758 seconds time elapsed

          0.000000000 seconds user
          0.037886000 seconds sys

Reported-by: Babu Moger <babu.moger@xxxxxxx>
Fixes: 4a2771895ca6 ("KVM: x86/svm/pmu: Add AMD PerfMonV2 support")
Signed-off-by: Sandipan Das <sandipan.das@xxxxxxx>
---
  arch/x86/kvm/svm/pmu.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index b6a7ad4d6914..14709c564d6a 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -205,6 +205,7 @@ static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
      if (pmu->version > 1) {
          pmu->global_ctrl_mask = ~((1ull << pmu->nr_arch_gp_counters) - 1);
          pmu->global_status_mask = pmu->global_ctrl_mask;
+        pmu->global_ctrl = ~pmu->global_ctrl_mask;


It seems easier to understand if global_ctrl is calculated first and global_ctrl_mask is then derived from it (negative logic).

diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index e886300f0f97..7ac9b080aba6 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -199,7 +199,8 @@ static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
                                         kvm_pmu_cap.num_counters_gp);

        if (pmu->version > 1) {
-               pmu->global_ctrl_mask = ~((1ull << pmu->nr_arch_gp_counters) - 1);
+               pmu->global_ctrl = (1ull << pmu->nr_arch_gp_counters) - 1;
+               pmu->global_ctrl_mask = ~pmu->global_ctrl;
                pmu->global_status_mask = pmu->global_ctrl_mask;
        }

      }
        pmu->counter_bitmask[KVM_PMC_GP] = ((u64)1 << 48) - 1;
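
For comparison, here is a small standalone sketch of the two orderings
discussed above. It is not the kernel code: struct pmu_sketch and both
helper names are made up for illustration, and only the three fields
touched by the hunks are modelled. Under these assumptions both orderings
produce the same values, so the choice is about readability only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the three struct kvm_pmu fields in the hunks. */
struct pmu_sketch {
	uint64_t global_ctrl;
	uint64_t global_ctrl_mask;
	uint64_t global_status_mask;
};

/* Ordering from the original patch: compute the mask, then invert it. */
static struct pmu_sketch refresh_mask_first(unsigned int nr_gp_counters)
{
	struct pmu_sketch pmu;

	assert(nr_gp_counters < 64);	/* keep the shift well defined */
	pmu.global_ctrl_mask = ~((1ULL << nr_gp_counters) - 1);
	pmu.global_status_mask = pmu.global_ctrl_mask;
	pmu.global_ctrl = ~pmu.global_ctrl_mask;
	return pmu;
}

/* Suggested ordering: compute the enable bits, then derive the mask. */
static struct pmu_sketch refresh_ctrl_first(unsigned int nr_gp_counters)
{
	struct pmu_sketch pmu;

	assert(nr_gp_counters < 64);
	pmu.global_ctrl = (1ULL << nr_gp_counters) - 1;
	pmu.global_ctrl_mask = ~pmu.global_ctrl;
	pmu.global_status_mask = pmu.global_ctrl_mask;
	return pmu;
}
```

With six counters, either helper gives global_ctrl equal to 0x3f, the reset
value quoted from the PPR earlier in the thread.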