Re: [PATCH v2] perf stat: Introduce skippable evsels

From: Ian Rogers
Date: Fri Apr 21 2023 - 11:49:56 EST


On Fri, Apr 21, 2023 at 6:32 AM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
>
>
>
> On 2023-04-20 8:19 p.m., Ian Rogers wrote:
> >>>> struct evlist *metric_evlist = evlist__new();
> >>>> struct evsel *metric_evsel;
> >>>>
> >>>> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> >>>> index 6b46bbb3d322..072fa56744b4 100644
> >>>> --- a/tools/perf/util/stat-display.c
> >>>> +++ b/tools/perf/util/stat-display.c
> >>>> @@ -747,7 +747,8 @@ static void uniquify_event_name(struct evsel *counter)
> >>>> int ret = 0;
> >>>>
> >>>> if (counter->uniquified_name || counter->use_config_name ||
> >>>> - !counter->pmu_name || !strncmp(counter->name, counter->pmu_name,
> >>>> + !counter->pmu_name || !counter->name ||
> >>>> + !strncmp(counter->name, counter->pmu_name,
> >>>> strlen(counter->pmu_name)))
> >>>> return;
> >>>
> >>> Is this a pre-existing hybrid bug? It is a real shame hybrid shows so
> >>> few common code paths. In general evsel__name should be preferred over
> >>> directly accessing name.
> >>
> >>
> >> I don't think so.
> >>
> >> I haven't dug into the bug yet. But from the source code I can tell that
> >> the check is the same as in the current 6.3-rc7.
> >>
> >> For the current 6.3-rc7, perf stat true works.
> >> The perf stat -M TopdownL1 --metric-no-group can work as well.
> >>
> >> But with the current perf-tools-next branch, perf stat true gives a
> >> Segmentation fault.
> >>
> >> The TopdownL1 doesn't work either.
> >>
> >> # ./perf stat -M TopdownL1 --metric-no-group
> >> Error:
> >> The sys_perf_event_open() syscall returned with 22 (Invalid argument)
> >> for event (topdown-retiring).
> >> /bin/dmesg | grep -i perf may provide additional information.
> >
> > I see hybrid failing basic sanity tests both for 6.3 and in
> > perf-tools-next. For metrics I see:
> >
> > ```
> > $ git status
> > ...
> > Your branch is up to date with 'linus/master'
> > ...
> > $ git describe
> > v6.3-rc7-139-gb7bc77e2f2c7
> > $ sudo perf stat -M TopdownL1 -a sleep 1
>
> Try the --metric-no-group.
>
>
> > WARNING: events in group from different hybrid PMUs!
> > WARNING: grouped events cpus do not match, disabling group:
> > anon group { topdown-retiring, topdown-retiring,
> > INT_MISC.UOP_DROPPING, topdown-fe-bound, topdown-fe-bound,
> > CPU_CLK_UNHALTED.CORE, topdown-be-bound, topdown-be-bound,
> > topdown-bad-spec, topdown-bad-spec }
> > Error:
> > The sys_perf_event_open() syscall returned with 22 (Invalid argument)
> > for event (topdown-retiring).
> > /bin/dmesg | grep -i perf may provide additional information.
> > ```
> >
> > It seems perf on hybrid is quite broken in 6.3, but I doubt we can fix
> > 6.3 given the late stage of the release cycle. As perf-tools-next
> > enables TopdownL1 metrics when no events or metrics are specified and
> > when the metric group is present, on hybrid this will cause the
> > pre-existing bug to appear for the no events/metrics case. I suspect
> > this is the cause of the crashes you see, but I'm seeing assertion
> > failures and similar as I'm using a debug build.
> >
> > I'm looking into fixing perf-tools-next and not 6.3. Maybe there will
> > be something we can cherry-pick back to fix up 6.3. It hasn't been
> > easy to find hardware to test on, and if the machine I'm remotely
> > using falls over then I have no means to test, so fingers crossed.
> >
>
> OK. So the json metric thing is buggy on both 6.3 and perf-tools-next. I
> think it is even worse in perf-tools-next.
> Besides the bugs, the json metrics also change the default output layout
> of perf stat. The tools that are built on top of perf are all impacted.
>
> The question is: why are we in such a rush to move the default of perf
> stat from the reliable kernel metrics to the json metrics?

So there is a long answer involving a description of the multi-year
effort between Intel and Google to transition to json metrics that are
fully tested, etc., but the short answer here is that:
1) the code has always called for json to be the approach:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/builtin-stat.c#n1841
".. Once platform specific metrics support has been added to the json
files, all architectures will use this approach..."
2) aggregation in the saved values used by hard coded metrics, and by
json metrics, is broken in 6.3. The bug report for this came from Intel
around the time 6.2 was wrapping up (i.e. not a new issue), and Intel
requested a fix. This has been in use for metric testing by Intel. The
fix removed the saved values used by the hard coded metrics, and even
if the hard coded metrics were ported to perf-tools-next, which version
should be ported? The version that disagrees with the TMA metric
spreadsheet that is in 6.3, or a version matching the json metric? If
you're just going to match the json metric then why not use the json
metric directly? It is less code and cleaner all round. Fixing the hard
coded metrics also alters the output formatting and so is unlikely to
resolve what has been an issue here.
3) hard coded metrics have a non-zero cost, in particular in output
formatting. Take for example:
https://lore.kernel.org/all/20221205042852.83382-1-atrajeev@xxxxxxxxxxxxxxxxxx/
"tools/perf: Fix printing field separator in CSV metrics output"
The perf tool has also had patches in this area from Red Hat. Having
the hard coded metrics is error prone, while the json metrics are far
more uniform and tested. We have in the region of 30 hard coded
metrics in Linux 6.3, far fewer in perf-tools-next, whereas we have
more than 4,100 json metrics.
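
As a rough illustration of that scale (these are standard perf and
shell commands, not something from the original discussion), the json
metrics ship with the tool and can be inspected directly:

```
# metric groups (TopdownL1, ...) and individual json metrics known to
# this perf binary
$ perf list metricgroup
$ perf list metric

# the json sources live under tools/perf/pmu-events/arch/ in the
# kernel tree; counting MetricName entries gives a ballpark figure
$ grep -r '"MetricName"' tools/perf/pmu-events/arch/ | wc -l
```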

> Can we take a step back? Create a perf/json_metric or whatever branch,
> fix all the issues thoroughly, and then merge to the mainline.
>
> I think the default of perf stat is frequently used by not only the
> newbies but also the veterans. That could have a big user-visible impact.
>
> The 6.4 merge window is approaching. Can we at least revert the patches
> for 6.4?
>
> Arnaldo, what do you think?

A revert is a complete no go: we'd be reintroducing the bugs that are
fixed, and who will then fix those? How will they be fixed in any way
other than how they've already been fixed? I've yet to see a
description of a bug that isn't either:
1) an issue because the output formatting changed - the fix here is to
use CSV or json output (see the sketch below); using the hard coded
metrics was a bug here anyway, not least as the metrics are wrong,
2) a pre-existing issue, such as hybrid being broken,
3) a non-issue, such as multiplexing on Skylake.
That's not to say everything couldn't run better, which was the issue
this patch was looking to address.
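
To make the fix for 1) concrete, here is a minimal sketch (using
standard perf stat options, nothing quoted from this thread): tools
consuming perf stat output should ask for one of the machine-readable
formats rather than scraping the default human-readable layout:

```
# CSV output, ',' as the field separator
$ perf stat -x, -M TopdownL1 true

# JSON output
$ perf stat --json-output -M TopdownL1 true
```

Both are intended for machine consumption, so changes to the default
human layout don't affect them.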

My team has called for this to be discussed with Intel, as we're firm
on what we see as the correct approach. I think that meeting can
conclude what will happen wrt reverts, but I haven't heard a sensible
argument for why reverts are necessary. I think the energy from this
conversation would be much better directed into fixing the issues we
know we have, which is why I'm currently looking at fixing some
aspects of what is broken in hybrid.

Thanks,
Ian

> Thanks,
> Kan