Re: [PATCH v2] perf stat: Introduce skippable evsels

From: Ian Rogers
Date: Tue Apr 18 2023 - 21:01:22 EST


On Tue, Apr 18, 2023 at 5:12 PM Ian Rogers <irogers@xxxxxxxxxx> wrote:
>
> On Tue, Apr 18, 2023 at 2:51 PM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
> >
> >
> >
> > On 2023-04-18 4:08 p.m., Ian Rogers wrote:
> > > On Tue, Apr 18, 2023 at 11:19 AM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
> > >>
> > >>
> > >>
> > >> On 2023-04-18 11:43 a.m., Ian Rogers wrote:
> > >>> On Tue, Apr 18, 2023 at 6:03 AM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
> > >>>>
> > >>>>
> > >>>>
> > >>>> On 2023-04-17 2:13 p.m., Ian Rogers wrote:
> > >>>>> The json TopdownL1 is enabled if present unconditionally for perf stat
> > >>>>> default. Enabling it on Skylake has multiplexing as TopdownL1 on
> > >>>>> Skylake has multiplexing unrelated to this change - at least on the
> > >>>>> machine I was testing on. We can remove the metric group TopdownL1 on
> > >>>>> Skylake so that we don't enable it by default, there is still the
> > >>>>> group TmaL1. To me, disabling TopdownL1 seems less desirable than
> > >>>>> running with multiplexing - previously to get into topdown analysis
> > >>>>> there has to be knowledge that "perf stat -M TopdownL1" is the way to
> > >>>>> do this.
> > >>>>
> > >>>> To be honest, I don't think it's a good idea to remove the TopdownL1. We
> > >>>> cannot remove it just because the new way cannot handle it. The perf
> > >>>> stat default works well until 6.3-rc7. It's a regression issue of the
> > >>>> current perf-tools-next.
> > >>>
> > >>> I'm not so clear it is a regression to consistently add TopdownL1 for
> > >>> all architectures supporting it.
> > >>
> > >>
> > >> Breaking the perf stat default is a regression.
> > >
> > > Breaking is overstating the use of multiplexing. The impact is less
> > > accuracy in the IPC and branch misses default metrics,
> >
> > Inaccuracy is a breakage for the default.
>
> Can you present a case where this matters? The events are already not
> grouped and so inaccurate for metrics.

Removing CPUs without perf metrics from the TopdownL1 metric group is
implemented here:
https://lore.kernel.org/lkml/20230419005423.343862-6-irogers@xxxxxxxxxx/
Note, this applies to pre-Icelake and atom CPUs as these also lack
perf metric (aka topdown) events.

With that change I don't have a case that requires skippable evsels,
and so we can take that series with patch 6 over the v1 of that series
with this change.

Thanks,
Ian

> > > if multiplexing
> > > is necessary on your Intel architecture. I believe TopdownL1 is more
> > > useful than either of these metrics and so having TopdownL1 be a
> > > default is an improvement. We can add a patch, as I keep repeating
> > > this is off-topic for this patch, to make it so that TopdownL1 isn't
> > > enabled on Intel CPUs pre-Icelake, but I would discourage this.
> >
> >
> > We need the TopdownL1. We just don't need TopdownL1 in the perf stat
> > default when perf metrics feature is not available.
>
> Perf metrics is an Intel only Icelake+ feature. I suggest the simplest
> way to achieve this would be to remove the TopdownL1 metric group from
> all Intel metrics before Icelake. This will mean on these
> architectures the group TmaL1 will need using instead.
>
> Thanks,
> Ian
>
> > >
> > >> The reason we once added the TopdownL1 for ICL and later platform is
> > >> that it doesn't break the original design (a *clean* output).
> > >
> > > Right, and in 6.3-rc7 the aggregation of counts was broken because of
> > > duplicated counts and hard coded metrics (I did a last minute partial
> > > fix). In perf-tools-next aggregation was fixed and we switched to the
> > > json metrics, that are accurate and up-to-date with the latest TMA
> > > metrics, so that we wouldn't need to maintain a duplicate code path.
> > > What keys enabling TopdownL1 in 6.3 is the presence of topdown events
> > > whilst in perf-tools-next it is the presence of TopdownL1 metric
> > > group, as this is a more consistent approach and had first been
> > > proposed by ARM.
> >
> > A consistent approach is good only when it can benefits all parties
> > rather than sacrifices any of them.
> >
> > Apparently, the approach in the perf-tools-next brings all kinds of
> > issues, multiplexing/inaccuracy in the perf stat default on Intel
> > platforms. Why cannot we fix it properly before applying the approach?
> >
> > I think Andi also mentioned the similar request when ARM introduced the
> > TopdownL1 metrics.
> > https://lore.kernel.org/linux-perf-users/12e0deef-08db-445f-4958-bcd5c3e10367@xxxxxxxxxxxxxxx/
> >
> > Thanks,
> > Kan