Re: [PATCH v2 1/7] perf/x86/core: Update x86_pmu.pebs_capable for ICELAKE_{X,D}

From: Like Xu
Date: Mon Aug 15 2022 - 05:43:50 EST


On 15/8/2022 5:31 pm, Peter Zijlstra wrote:
On Fri, Aug 12, 2022 at 09:52:13AM +0200, Paolo Bonzini wrote:
On 7/21/22 12:35, Like Xu wrote:
From: Like Xu <likexu@xxxxxxxxxxx>

Ice Lake microarchitecture with EPT-Friendly PEBS capability also supports
the Extended PEBS feature, which means that all counters (both fixed-function
and general-purpose counters) can be used for PEBS events.

Update x86_pmu.pebs_capable like SPR to apply PEBS_ALL semantics.

Cc: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
Fixes: fb358e0b811e ("perf/x86/intel: Add EPT-Friendly PEBS for Ice Lake Server")
Signed-off-by: Like Xu <likexu@xxxxxxxxxxx>
---
arch/x86/events/intel/core.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 4e9b7af9cc45..e46fd496187b 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -6239,6 +6239,7 @@ __init int intel_pmu_init(void)
case INTEL_FAM6_ICELAKE_X:
case INTEL_FAM6_ICELAKE_D:
x86_pmu.pebs_ept = 1;
+ x86_pmu.pebs_capable = ~0ULL;
pmem = true;
fallthrough;
case INTEL_FAM6_ICELAKE_L:

Peter, can you please ack this (you were not CCed on this KVM series but
this patch is really perf core)?

I would much rather see something like this, except I don't know if it
is fully correct. I can never find which models support what... Kan, do
you know?

For guest PEBS, it's a minor optimization from commit 0d23dc34a7ce to reduce
branch instructions:
https://lore.kernel.org/kvm/YKIqbph62oclxjnt@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/



diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 2db93498ff71..b42c1beb9924 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5933,7 +5933,6 @@ __init int intel_pmu_init(void)
x86_pmu.pebs_aliases = NULL;
x86_pmu.pebs_prec_dist = true;
x86_pmu.lbr_pt_coexist = true;
- x86_pmu.pebs_capable = ~0ULL;
x86_pmu.flags |= PMU_FL_HAS_RSP_1;
x86_pmu.flags |= PMU_FL_PEBS_ALL;
x86_pmu.get_event_constraints = glp_get_event_constraints;
@@ -6291,7 +6290,6 @@ __init int intel_pmu_init(void)
x86_pmu.pebs_aliases = NULL;
x86_pmu.pebs_prec_dist = true;
x86_pmu.pebs_block = true;
- x86_pmu.pebs_capable = ~0ULL;
x86_pmu.flags |= PMU_FL_HAS_RSP_1;
x86_pmu.flags |= PMU_FL_NO_HT_SHARING;
x86_pmu.flags |= PMU_FL_PEBS_ALL;
@@ -6337,7 +6335,6 @@ __init int intel_pmu_init(void)
x86_pmu.pebs_aliases = NULL;
x86_pmu.pebs_prec_dist = true;
x86_pmu.pebs_block = true;
- x86_pmu.pebs_capable = ~0ULL;
x86_pmu.flags |= PMU_FL_HAS_RSP_1;
x86_pmu.flags |= PMU_FL_NO_HT_SHARING;
x86_pmu.flags |= PMU_FL_PEBS_ALL;
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index ba60427caa6d..e2da643632b9 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2258,6 +2258,7 @@ void __init intel_ds_init(void)
x86_pmu.drain_pebs = intel_pmu_drain_pebs_icl;
x86_pmu.pebs_record_size = sizeof(struct pebs_basic);
if (x86_pmu.intel_cap.pebs_baseline) {
+ x86_pmu.pebs_capable = ~0ULL;

The two features, "Extended PEBS" (reflected in pebs_capable) and
"Adaptive PEBS" (reflected in pebs_baseline), are orthogonal, although
they are often supported together.
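
As a rough illustration of that point, here is a standalone userspace
sketch, not kernel code. Every sketch_* name is hypothetical, and the
8-counter general-purpose mask and the fixed-counter offset of 32 are
assumptions made for the example. It only shows that the counter mask
follows the Extended PEBS capability and never consults the Adaptive
PEBS one:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: 8 general-purpose counters in bits 0..7. */
#define SKETCH_GP_PEBS_MASK ((1ULL << 8) - 1)
/* Assumption for illustration: fixed counters start at bit 32. */
#define SKETCH_FIXED_IDX 32

/* Hypothetical capability flags; the two are set independently. */
struct sketch_caps {
	int extended_pebs;	/* every counter, fixed or GP, may do PEBS */
	int adaptive_pebs;	/* configurable record layout (pebs_baseline) */
};

/* The mask depends on extended_pebs only; adaptive_pebs is ignored. */
static uint64_t sketch_pebs_capable(struct sketch_caps caps)
{
	return caps.extended_pebs ? ~0ULL : SKETCH_GP_PEBS_MASK;
}

/* Returns 1 if counter index idx is set in the PEBS-capable mask. */
static int sketch_counter_can_pebs(uint64_t mask, unsigned int idx)
{
	assert(idx < 64);	/* shifting a 64-bit value by >= 64 is undefined */
	return (int)((mask >> idx) & 1);
}
```

With extended_pebs set and adaptive_pebs clear, the first fixed counter
(index SKETCH_FIXED_IDX) is still PEBS capable in this model. With the
flags the other way around, it is not.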

x86_pmu.large_pebs_flags |=
PERF_SAMPLE_BRANCH_STACK |
PERF_SAMPLE_TIME;