Re: [PATCH v4 00/44] Fix perf on Intel hybrid CPUs

From: Arnaldo Carvalho de Melo
Date: Fri May 12 2023 - 14:31:13 EST


Em Tue, May 02, 2023 at 03:38:07PM -0700, Ian Rogers escreveu:
> TL;DR: hybrid doesn't crash, json metrics work on hybrid on both PMUs
> or individually, event parsing doesn't always scan all PMUs, more and
> new tests that also run without hybrid, less code.
>
> The first 4 patches are aimed at Linux 6.4 to address issues raised,
> in particular by Kan, on the existing perf stat behavior with json
> metrics. They avoid duplicated events by removing groups. They don't
> hide events and metrics to make event multiplexing obvious. They avoid
> terminating perf when paranoia is higher due to certain events that
> always fail. They avoid rearranging events by PMUs when the events
> aren't in a group.

Took the first 4, skipped patch 19 that Adrian asked to be in 6.4 and
then went all the way to:

Applying: perf test: Add cputype testing to perf stat
error: patch failed: tools/perf/tests/shell/stat.sh:91
error: tools/perf/tests/shell/stat.sh: patch does not apply
Patch failed at 0012 perf test: Add cputype testing to perf stat
hint: Use 'git am --show-current-patch=diff' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
⬢[acme@toolbox perf-tools-next]$ git am --abort
⬢[acme@toolbox perf-tools-next]$

Will test with it and put what I have in tmp.perf-tools-next, testing
now.

- Arnaldo


> The next 5 patches avoid grouping events for metrics where they could
> never succeed and were previously posted as:
> "perf vendor events intel: Add xxx metric constraints"
> https://lore.kernel.org/all/20230419005423.343862-1-irogers@xxxxxxxxxx/
> In general the generated json is coming from:
> https://github.com/intel/perfmon/pull/73
>
> Next are some general and test improvements.
>
> Next event parsing is rewritten to not scan all PMUs for the benefit
> of raw and legacy cache parsing, instead these are handled by the
> lexer and a new term type. This ultimately removes the need for the
> event parser for hybrid to be recursive as legacy cache can be just a
> term. Tests are re-enabled for events with hyphens, so AMD's
> branch-brs event is now parsable.
>
> The cputype option is made a generic pmu filter flag and is tested
> even on non-hybrid systems.
>
> The final patches address specific json metric issues on hybrid, in
> both the json metrics and the metric code.
>
> The patches add slightly more code than they remove, in areas like
> better json metric constraints and tests, but in the core util code,
> the removal of hybrid is a net reduction:
> 22 files changed, 711 insertions(+), 1016 deletions(-)
>
> Sample output is contained in the v1 patch set:
> https://lore.kernel.org/lkml/bff481ba-e60a-763f-0aa0-3ee53302c480@xxxxxxxxxxxxxxx/
>
> Tested on Tigerlake, Skylake and Alderlake CPUs.
>
> The v4 patch set:
> - rebase, 1 of the Linux 6.4 recommended patches are merged leaving:
> 1) perf metric: Change divide by zero and !support events behavior
> 2) perf stat: Introduce skippable evsels
> 3) perf metric: Json flag to not group events if gathering a metric group
> 4) perf parse-events: Don't reorder ungrouped events by pmu
> whose diffstat is:
> 30 files changed, 326 insertions(+), 33 deletions(-)
> but without the vendor event updates (the tend to be large as they
> repeat something per architecture per metric) is just:
> 10 files changed, 90 insertions(+), 32 deletions(-)
> - Address review comments from Ravi Bangoria <ravi.bangoria@xxxxxxx>
> that are only in the patches targeting Linux 6.5.
>
> The v3 patch set:
> - for Linux 6.4 the first 5 patches are recommended:
> - changes skippable evsels to always print in case short running
> benchmarks meet the 0 enable and 0 count condition (suggested by
> Stephane).
> - changes metrics to show "nan" so that it is more obvious they
> couldn't be computed (suggested by Stephane).
> - fixes a reordering issue that reordered events that lacked a
> group, especially when the core PMU isn't "cpu".
> - for Linux 6.5 added extra hybrid type tests in the x86 hybrid test
> as suggested by Kan.
> - note, the patches aren't rebased against the tmp.perf-tools branch
> meaning patches 1 and 11 should be dropped there.
>
> The v2 patch set:
> - restructures the 3 Linux 6.4 patches first;
> - makes it so that is_event_supported isn't called during core event parsing;
> - displays skippable 0 count events that were enabled;
> - addresses output formatting review comments;
> - addresses some test issues and an uninitialized variable use in strchr;
> - addresses checkpatch.pl reported issues;
> - adds reviewed-by to some patches.
>
> Ian Rogers (44):
> perf metric: Change divide by zero and !support events behavior
> perf stat: Introduce skippable evsels
> perf metric: Json flag to not group events if gathering a metric group
> perf parse-events: Don't reorder ungrouped events by pmu
> perf vendor events intel: Add alderlake metric constraints
> perf vendor events intel: Add icelake metric constraints
> perf vendor events intel: Add icelakex metric constraints
> perf vendor events intel: Add sapphirerapids metric constraints
> perf vendor events intel: Add tigerlake metric constraints
> perf test: Test more sysfs events
> perf test: Use valid for PMU tests
> perf test: Mask configs with extended types then test
> perf test: Test more with config_cache
> perf test: Roundtrip name, don't assume 1 event per name
> perf parse-events: Set attr.type to PMU type early
> perf parse-events: Set pmu_name whenever a pmu is given
> perf print-events: Avoid unnecessary strlist
> perf parse-events: Avoid scanning PMUs before parsing
> perf evsel: Modify group pmu name for software events
> perf test: Move x86 hybrid tests to arch/x86
> perf test x86 hybrid: Update test expectations
> perf test x86 hybrid: Add hybrid extended type checks
> perf parse-events: Support PMUs for legacy cache events
> perf parse-events: Wildcard legacy cache events
> perf print-events: Print legacy cache events for each PMU
> perf parse-events: Support wildcards on raw events
> perf parse-events: Remove now unused hybrid logic
> perf parse-events: Minor type safety cleanup
> perf parse-events: Add pmu filter
> perf stat: Make cputype filter generic
> perf test: Add cputype testing to perf stat
> perf test: Fix parse-events tests for >1 core PMU
> perf parse-events: Support hardware events as terms
> perf parse-events: Avoid error when assigning a term
> perf parse-events: Avoid error when assigning a legacy cache term
> perf parse-events: Don't auto merge hybrid wildcard events
> perf parse-events: Don't reorder atom cpu events
> perf metrics: Be PMU specific for referenced metrics.
> perf stat: Command line PMU metric filtering
> perf vendor events intel: Correct alderlake metrics
> perf jevents: Don't rewrite metrics across PMUs
> perf metrics: Be PMU specific in event match
> perf stat: Don't disable TopdownL1 metric on hybrid
> perf parse-events: Reduce scope of is_event_supported
>
> tools/perf/arch/x86/include/arch-tests.h | 1 +
> tools/perf/arch/x86/tests/Build | 1 +
> tools/perf/arch/x86/tests/arch-tests.c | 10 +
> tools/perf/arch/x86/tests/hybrid.c | 287 ++++++
> tools/perf/arch/x86/util/evlist.c | 4 +-
> tools/perf/builtin-list.c | 21 +-
> tools/perf/builtin-record.c | 13 +-
> tools/perf/builtin-stat.c | 77 +-
> tools/perf/builtin-top.c | 5 +-
> tools/perf/builtin-trace.c | 5 +-
> .../arch/x86/alderlake/adl-metrics.json | 275 +++---
> .../arch/x86/alderlaken/adln-metrics.json | 20 +-
> .../arch/x86/broadwell/bdw-metrics.json | 12 +
> .../arch/x86/broadwellde/bdwde-metrics.json | 12 +
> .../arch/x86/broadwellx/bdx-metrics.json | 12 +
> .../arch/x86/cascadelakex/clx-metrics.json | 12 +
> .../arch/x86/haswell/hsw-metrics.json | 12 +
> .../arch/x86/haswellx/hsx-metrics.json | 12 +
> .../arch/x86/icelake/icl-metrics.json | 23 +
> .../arch/x86/icelakex/icx-metrics.json | 23 +
> .../arch/x86/ivybridge/ivb-metrics.json | 12 +
> .../arch/x86/ivytown/ivt-metrics.json | 12 +
> .../arch/x86/jaketown/jkt-metrics.json | 12 +
> .../arch/x86/sandybridge/snb-metrics.json | 12 +
> .../arch/x86/sapphirerapids/spr-metrics.json | 23 +
> .../arch/x86/skylake/skl-metrics.json | 12 +
> .../arch/x86/skylakex/skx-metrics.json | 12 +
> .../arch/x86/tigerlake/tgl-metrics.json | 23 +
> tools/perf/pmu-events/jevents.py | 10 +-
> tools/perf/pmu-events/metric.py | 28 +-
> tools/perf/pmu-events/metric_test.py | 6 +-
> tools/perf/pmu-events/pmu-events.h | 2 +
> tools/perf/tests/evsel-roundtrip-name.c | 119 +--
> tools/perf/tests/expr.c | 3 +-
> tools/perf/tests/parse-events.c | 858 +++++++++---------
> tools/perf/tests/parse-metric.c | 1 +
> tools/perf/tests/pmu-events.c | 12 +-
> tools/perf/tests/shell/stat.sh | 44 +
> tools/perf/util/Build | 1 -
> tools/perf/util/evlist.h | 1 -
> tools/perf/util/evsel.c | 30 +-
> tools/perf/util/evsel.h | 1 +
> tools/perf/util/expr.y | 6 +-
> tools/perf/util/metricgroup.c | 111 ++-
> tools/perf/util/metricgroup.h | 3 +-
> tools/perf/util/parse-events-hybrid.c | 214 -----
> tools/perf/util/parse-events-hybrid.h | 25 -
> tools/perf/util/parse-events.c | 720 ++++++---------
> tools/perf/util/parse-events.h | 63 +-
> tools/perf/util/parse-events.l | 108 +--
> tools/perf/util/parse-events.y | 222 ++---
> tools/perf/util/pmu-hybrid.c | 20 -
> tools/perf/util/pmu-hybrid.h | 1 -
> tools/perf/util/pmu.c | 16 +-
> tools/perf/util/pmu.h | 3 +
> tools/perf/util/pmus.c | 25 +-
> tools/perf/util/pmus.h | 3 +
> tools/perf/util/print-events.c | 127 ++-
> tools/perf/util/stat-display.c | 2 +-
> tools/perf/util/stat-shadow.c | 25 +-
> 60 files changed, 2066 insertions(+), 1699 deletions(-)
> create mode 100644 tools/perf/arch/x86/tests/hybrid.c
> delete mode 100644 tools/perf/util/parse-events-hybrid.c
> delete mode 100644 tools/perf/util/parse-events-hybrid.h
>
> --
> 2.40.1.495.gc816e09b53d-goog
>

--

- Arnaldo