Re: [PATCH v2] perf stat: Introduce skippable evsels

From: Liang, Kan
Date: Wed Apr 19 2023 - 14:57:20 EST



On 2023-04-19 12:51 p.m., Ian Rogers wrote:
>>>>> With that change I don't have a case that requires skippable evsels,
>>>>> and so we can take that series with patch 6 over the v1 of that series
>>>>> with this change.
>>>>>
>>>>
>>>> I'm afraid this is not the only problem the commit 94b1a603fca7 ("perf
>>>> stat: Add TopdownL1 metric as a default if present") in the
>>>> perf-tools-next branch introduced.
>>>>
>>>> The topdown L2 in the perf stat default on SPR and big core of the ADL
>>>> is still missed. I don't see a possible fix for this on the current
>>>> perf-tools-next branch.
>>>
>>> I thought in its current state the json metrics for TopdownL2 on SPR
>>> have multiplexing. Given L1 is used to drill down to L2, it seems odd
>>> to start on L2, but given L1 is used to compute the thresholds for L2,
>>> this should be to have both L1 and L2 on these platforms. However,
>>> that doesn't work as you don't want multiplexing.
>>>
>>> This all seems backward to avoid potential multiplexing on branch miss
>>> rate and IPC, just always having TopdownL1 seems cleanest with the
>>> skippable evsels working around the permissions issue - as put forward
>>> in this patch. Possibly adding L2 metrics on ADL/SPR, but only once
>>> the multiplexing issue is resolved.
>>>
>>
>> No, not just that issue. Based to what I tested these days, perf stat
>> default has issues/regressions on most of the Intel platforms with the
>> current perf-tools-next and perf/core branch of acme's repo.
>>
>> For the pre-ICL platforms:
>> - The permission issue. (This patch tried to address.)
>> - Unclean perf stat default. (This patch failed to address.)
>> Unnecessary multiplexing for cycles.
>> Display partial of the TopdownL1
>>
>> https://lore.kernel.org/lkml/d1fe801a-22d0-1f9b-b127-227b21635bd5@xxxxxxxxxxxxxxx/
>>
>> For SPR platforms
>> - Topdown L2 metrics is missed, while it works with the current 6.3-rc7.
>>
>> For ADL/RPL platforms
>> - Segmentation fault which I just found this morning.
>> # ./perf stat true
>> Segmentation fault (core dumped)
>
> This may also stem from the reference count checking work that Arnaldo
> is currently merging. It is hard to test hybrid because it uses
> non-generic code paths.

There are two places which causes the Segmentation fault.
One place is the TopdownL1.

After I disable the TopdownL1 and add !counter->name as below, there are
no errors for the ./perf stat true.

(The below is just for test purpose.)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 7a641a67486d..8e12ed1141e0 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1892,7 +1892,7 @@ static int add_default_attributes(void)
* Add TopdownL1 metrics if they exist. To minimize
* multiplexing, don't request threshold computation.
*/
- if (metricgroup__has_metric("TopdownL1")) {
+ if (0 && metricgroup__has_metric("TopdownL1")) {
struct evlist *metric_evlist = evlist__new();
struct evsel *metric_evsel;

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 6b46bbb3d322..072fa56744b4 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -747,7 +747,7 @@ static void uniquify_event_name(struct evsel *counter)
int ret = 0;

if (counter->uniquified_name || counter->use_config_name ||
- !counter->pmu_name || !strncmp(counter->name, counter->pmu_name,
+ !counter->pmu_name || !counter->name ||
!strncmp(counter->name, counter->pmu_name,
strlen(counter->pmu_name)))
return;

>
>> After the test on a hybrid machine, I incline to revert the commit
>> 94b1a603fca7 ("perf stat: Add TopdownL1 metric as a default if present")
>> and related patches for now.
>>
>> To clarify, I do not object a generic solution for the Topdown on
>> different ARCHs. But the current generic solution aka TopdownL1 has all
>> kinds of problems on most of Intel platforms. We should fix them first
>> before applying to the mainline.
>
> No, 6.3 has many issues as do the default events/metrics:

To be honest, I don't think they are real critical issues. For the first
one, I think there was already a temporary fix. For the others, they are
there for years.

However, the solution you proposed in the huge patch set
(https://lore.kernel.org/lkml/20230219092848.639226-37-irogers@xxxxxxxxxx/)
brings many critical issues on different Intel platforms, crashes,
Missing features, etc.
Also, I was just told that many of our existing tools which on top of
the perf tool will also be broken, because all the annotations of the
kernel top-down metrics event disappeared.

So we really should revert the patches. I don't think patches 39 to 51
are well-tested and reviewed.

Thanks,
Kan

> - in 6.3 aggregation is wrong for metrics, specifically double
> counting happens as the summary gets applied to the saved value under
> certain conditions like repeat runs. This is solved in perf-tools-next
> by removing the duplicated aggregation and using a single set of
> counters. The "shadow" code is both simpler and more correct than in
> 6.3, reverting it would be a massive regression - tbh, if it wasn't
> for the size of the patches I think we should cherry-pick what is in
> perf-tools-next back to 6.3 due to the size of the errors.
> - the default events for IPC and branch miss rate lack groups and so
> are inaccurate when multiplexing happens - but multiplexing inaccuracy
> has historically not been an issue for the default events/metrics.
> - the previous default topdown metrics were inaccurate compared to
> the TMA metrics spreadsheet, specifically they lacked events to
> correct for errors and the thresholds differed. This is why TopdownL2
> is a challenge because the json metrics do things properly causing the
> event groups not to be trivially shared.
> - the topdown event solution is not generic, it is not only Intel
> specific but Intel model specific. Solving the problem via the json is
> the most consistent way to do this and is what I suggest we move to
> in:
> https://lore.kernel.org/lkml/20230419005423.343862-6-irogers@xxxxxxxxxx/
> but you can keep pushing that perf should do something special for
> Intel and for every Intel model. It also pushes a need into the metric
> code to do hybrid specific things so we can ignore atom TopdownL1. In
> this vein, hybrid is a giant mess that the code base needs cleaning up
> from; take the recent patch to just disable generic event parsing
> tests for MTL:
> https://lore.kernel.org/lkml/20230411094330.653965-1-tinghao.zhang@xxxxxxxxx/
> We need to stop doing this, and have generic code, not least because
> of complaints like:
> https://www.kernel.org/doc/html/latest/gpu/i915.html#issues-hit-with-first-prototype-based-on-core-perf
> and because of the crash you are reporting on ADL/RPL.
>
> What has motivated you here is a concern over multiplexing on branch
> miss rate and IPC on pre-Icelake architectures that I find hard to
> understand and is trivially remedied in time honored fashions, ie
> profile the metrics you care about separately. A far more egregious
> multiplexing problem is that on Icelake+ architectures we multiplex
> fixed and generic counters unnecessarily when the same fixed counter
> is needed or there are insufficient generic counters. This is
> egregious as there is no user workaround and the kernel fix, to
> iterate all hardware events the same way we do for all software
> events, is both obvious and trivial. We've lived with this
> multiplexing problem since Icelake and I think we can live with a
> possible multiplexing problem, on legacy architectures, with what is
> in perf-tools-next. The permissions problem tackled here means we
> should merge this.
>
> Thanks,
> Ian
>
>> Thanks,
>> Kan