Re: [PATCH v2] perf stat: Introduce skippable evsels

From: Ian Rogers
Date: Thu Apr 20 2023 - 20:20:20 EST


On Thu, Apr 20, 2023 at 6:04 AM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
>
>
>
> On 2023-04-19 8:23 p.m., Ian Rogers wrote:
> > On Wed, Apr 19, 2023 at 11:57 AM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
> >>
> >>
> >> On 2023-04-19 12:51 p.m., Ian Rogers wrote:
> >>>>>>> With that change I don't have a case that requires skippable evsels,
> >>>>>>> and so we can take that series with patch 6 over the v1 of that series
> >>>>>>> with this change.
> >>>>>>>
> >>>>>>
> >>>>>> I'm afraid this is not the only problem the commit 94b1a603fca7 ("perf
> >>>>>> stat: Add TopdownL1 metric as a default if present") in the
> >>>>>> perf-tools-next branch introduced.
> >>>>>>
> >>>>>> The topdown L2 in the perf stat default on SPR and big core of the ADL
> >>>>>> is still missed. I don't see a possible fix for this on the current
> >>>>>> perf-tools-next branch.
> >>>>>
> >>>>> I thought in its current state the json metrics for TopdownL2 on SPR
> >>>>> have multiplexing. Given L1 is used to drill down to L2, it seems odd
> >>>>> to start on L2, but given L1 is used to compute the thresholds for L2,
> >>>>> this should be to have both L1 and L2 on these platforms. However,
> >>>>> that doesn't work as you don't want multiplexing.
> >>>>>
> >>>>> This all seems backward to avoid potential multiplexing on branch miss
> >>>>> rate and IPC, just always having TopdownL1 seems cleanest with the
> >>>>> skippable evsels working around the permissions issue - as put forward
> >>>>> in this patch. Possibly adding L2 metrics on ADL/SPR, but only once
> >>>>> the multiplexing issue is resolved.
> >>>>>
> >>>>
> >>>> No, not just that issue. Based to what I tested these days, perf stat
> >>>> default has issues/regressions on most of the Intel platforms with the
> >>>> current perf-tools-next and perf/core branch of acme's repo.
> >>>>
> >>>> For the pre-ICL platforms:
> >>>> - The permission issue. (This patch tried to address.)
> >>>> - Unclean perf stat default. (This patch failed to address.)
> >>>> Unnecessary multiplexing for cycles.
> >>>> Display partial of the TopdownL1
> >>>>
> >>>> https://lore.kernel.org/lkml/d1fe801a-22d0-1f9b-b127-227b21635bd5@xxxxxxxxxxxxxxx/
> >>>>
> >>>> For SPR platforms
> >>>> - Topdown L2 metrics is missed, while it works with the current 6.3-rc7.
> >>>>
> >>>> For ADL/RPL platforms
> >>>> - Segmentation fault which I just found this morning.
> >>>> # ./perf stat true
> >>>> Segmentation fault (core dumped)
> >>>
> >>> This may also stem from the reference count checking work that Arnaldo
> >>> is currently merging. It is hard to test hybrid because it uses
> >>> non-generic code paths.
> >>
> >> There are two places which causes the Segmentation fault.
> >> One place is the TopdownL1.
> >>
> >> After I disable the TopdownL1 and add !counter->name as below, there are
> >> no errors for the ./perf stat true.
> >>
> >> (The below is just for test purpose.)
> >> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> >> index 7a641a67486d..8e12ed1141e0 100644
> >> --- a/tools/perf/builtin-stat.c
> >> +++ b/tools/perf/builtin-stat.c
> >> @@ -1892,7 +1892,7 @@ static int add_default_attributes(void)
> >> * Add TopdownL1 metrics if they exist. To minimize
> >> * multiplexing, don't request threshold computation.
> >> */
> >> - if (metricgroup__has_metric("TopdownL1")) {
> >> + if (0 && metricgroup__has_metric("TopdownL1")) {
> >
> > So hybrid has something different that causes this. Can you provide
> > the information to solve?
>
> Here is the backtrace.
>
> #0 get_group_fd (thread=0, cpu_map_idx=<optimized out>,
> evsel=0x555556015af0) at util/evsel.c:1722
> #1 evsel__open_cpu (evsel=<optimized out>, cpus=<optimized out>,
> threads=<optimized out>,
> start_cpu_map_idx=<optimized out>, end_cpu_map_idx=<optimized out>)
> at util/evsel.c:2105
> #2 0x000055555561dd9e in __run_perf_stat (run_idx=<optimized out>,
> argv=0x7fffffffe1d0, argc=1)
> at builtin-stat.c:734
> #3 run_perf_stat (run_idx=<optimized out>, argv=0x7fffffffe1d0, argc=1)
> at builtin-stat.c:949
> #4 cmd_stat (argc=1, argv=0x7fffffffe1d0) at builtin-stat.c:2537
> #5 0x00005555556b56a0 in run_builtin (p=p@entry=0x555555f84450
> <commands+336>, argc=argc@entry=2,
> argv=argv@entry=0x7fffffffe1d0) at perf.c:323
> #6 0x00005555555fe2d9 in handle_internal_command (argv=0x7fffffffe1d0,
> argc=2) at perf.c:377
> #7 run_argv (argv=<synthetic pointer>, argcp=<synthetic pointer>) at
> perf.c:421
> #8 main (argc=2, argv=0x7fffffffe1d0) at perf.c:537
>
> >
> >> struct evlist *metric_evlist = evlist__new();
> >> struct evsel *metric_evsel;
> >>
> >> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> >> index 6b46bbb3d322..072fa56744b4 100644
> >> --- a/tools/perf/util/stat-display.c
> >> +++ b/tools/perf/util/stat-display.c
> >> @@ -747,7 +747,7 @@ static void uniquify_event_name(struct evsel *counter)
> >> int ret = 0;
> >>
> >> if (counter->uniquified_name || counter->use_config_name ||
> >> - !counter->pmu_name || !strncmp(counter->name, counter->pmu_name,
> >> + !counter->pmu_name || !counter->name ||
> >> !strncmp(counter->name, counter->pmu_name,
> >> strlen(counter->pmu_name)))
> >> return;
> >
> > Is this a pre-existing hybrid bug? It is a real shame hybrid shows so
> > few common code paths. In general evsel__name should be preferred over
> > directly accessing name.
>
>
> I don't think so.
>
> I haven't dig into the bug yet. But from the source code I can tell that
> the check is the same as the current 6.3-rc7.
>
> For the current 6.3-rc7, perf stat true works.
> The perf stat -M TopdownL1 --metric-no-group can work as well.
>
> But with the current perf-tools-next branch, perf stat true gives a
> Segmentation fault.
>
> The TopdownL1 doesn't work either.
>
> # ./perf stat -M TopdownL1 --metric-no-group
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument)
> for event (topdown-retiring).
> /bin/dmesg | grep -i perf may provide additional information.

I see hybrid failing basic sanity tests both for 6.3 and in
perf-tools-next. For metrics I see:

```
$ git status
...
Your branch is up to date with 'linus/master'
...
$ git describe
v6.3-rc7-139-gb7bc77e2f2c7
$ sudo perf stat -M TopdownL1 -a sleep 1
WARNING: events in group from different hybrid PMUs!
WARNING: grouped events cpus do not match, disabling group:
anon group { topdown-retiring, topdown-retiring,
INT_MISC.UOP_DROPPING, topdown-fe-bound, topdown-fe-bound,
CPU_CLK_UNHALTED.CORE, topdown-be-bound, topdown-be-bound,
topdown-bad-spec, topdown-bad-spec }
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument)
for event (topdown-retiring).
/bin/dmesg | grep -i perf may provide additional information.
```

It seems perf on hybrid is quite broken in 6.3, but I doubt we can fix
6.3 given the late stage of the release cycle. As perf-tools-next
enables TopdownL1 metrics when no events or metric are specified and
when the metric group is present, on hybrid this will cause the
pre-existing bug to appear for the no events/metrics case. I suspect
this is the cause of the crashes you see, but I'm seeing assertion
failures and similar as I'm using a debug build.

I'm looking into fixing perf-tools-next and not 6.3. Maybe there will
be something we can cherry-pick back to fix up 6.3. It hasn't been
easy to find hardware to test on, and if the machine I'm remotely
using falls over then I have no means to test, so fingers crossed.

Thanks,
Ian

> >
> >>>
> >>>> After the test on a hybrid machine, I incline to revert the commit
> >>>> 94b1a603fca7 ("perf stat: Add TopdownL1 metric as a default if present")
> >>>> and related patches for now.
> >>>>
> >>>> To clarify, I do not object a generic solution for the Topdown on
> >>>> different ARCHs. But the current generic solution aka TopdownL1 has all
> >>>> kinds of problems on most of Intel platforms. We should fix them first
> >>>> before applying to the mainline.
> >>>
> >>> No, 6.3 has many issues as do the default events/metrics:
> >>
> >> To be honest, I don't think they are real critical issues. For the first
> >> one, I think there was already a temporary fix. For the others, they are
> >> there for years.
> >
> > This isn't true. The aggregation bug was raised to me by Intel and
> > stems from aggregation refactoring in per-thread mode done by
> > Namhyung.
> >
> >> However, the solution you proposed in the huge patch set
> >> (https://lore.kernel.org/lkml/20230219092848.639226-37-irogers@xxxxxxxxxx/)
> >> brings many critical issues on different Intel platforms, crashes,
> >> Missing features, etc.
> >> Also, I was just told that many of our existing tools which on top of
> >> the perf tool will also be broken, because all the annotations of the
> >> kernel top-down metrics event disappeared.
> >>
> >> So we really should revert the patches. I don't think patches 39 to 51
> >> are well-tested and reviewed.
> >
> > The only issue I'm aware of is that hard coded use of the inaccurate
> > hard coded metrics now needs to switch to json metrics. This seems a
> > worthwhile update anyway, and not one that would justify breaking perf
> > stat metrics.
> >
>
> I have pointed out many issues (crash, missing features, user-visible
> changes which breaking existing tools) many times in the previous reply,
> I will not repeat the details again here.
>
> The json metrics in the current perf-tools-next branch which you want to
> switch to also has many bugs based on my recent test.
> - The TopdownL2 triggers unnecessary multiplexing on SPR.
> - Doesn't work on hybrid platforms while the current 6.3-rc7 can.
>
> The difference between the hard coded kernel metrics and the json
> metrics is that the json metrics just collect two more events, e.g., on
> Tigerlake. The core metrics is exactly the same. It's good enough for a
> initial judgement with core metrics.
>
> I don't think the switching is a good choice and necessary based on all
> of the above issues.
>
> Thanks,
> Kan