Re: [PATCH v2] perf stat: Introduce skippable evsels

From: Ian Rogers
Date: Tue Apr 18 2023 - 20:13:53 EST


On Tue, Apr 18, 2023 at 2:51 PM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
>
>
>
> On 2023-04-18 4:08 p.m., Ian Rogers wrote:
> > On Tue, Apr 18, 2023 at 11:19 AM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
> >>
> >>
> >>
> >> On 2023-04-18 11:43 a.m., Ian Rogers wrote:
> >>> On Tue, Apr 18, 2023 at 6:03 AM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 2023-04-17 2:13 p.m., Ian Rogers wrote:
> >>>>> The json TopdownL1 is enabled if present unconditionally for perf stat
> >>>>> default. Enabling it on Skylake has multiplexing as TopdownL1 on
> >>>>> Skylake has multiplexing unrelated to this change - at least on the
> >>>>> machine I was testing on. We can remove the metric group TopdownL1 on
> >>>>> Skylake so that we don't enable it by default, there is still the
> >>>>> group TmaL1. To me, disabling TopdownL1 seems less desirable than
> >>>>> running with multiplexing - previously to get into topdown analysis
> >>>>> there has to be knowledge that "perf stat -M TopdownL1" is the way to
> >>>>> do this.
> >>>>
> >>>> To be honest, I don't think it's a good idea to remove the TopdownL1. We
> >>>> cannot remove it just because the new way cannot handle it. The perf
> >>>> stat default works well until 6.3-rc7. It's a regression issue of the
> >>>> current perf-tools-next.
> >>>
> >>> I'm not so clear it is a regression to consistently add TopdownL1 for
> >>> all architectures supporting it.
> >>
> >>
> >> Breaking the perf stat default is a regression.
> >
> > Breaking is overstating the use of multiplexing. The impact is less
> > accuracy in the IPC and branch misses default metrics,
>
> Inaccuracy is a breakage for the default.

Can you present a case where this matters? The events are already not
grouped and so inaccurate for metrics.

> > if multiplexing
> > is necessary on your Intel architecture. I believe TopdownL1 is more
> > useful than either of these metrics and so having TopdownL1 be a
> > default is an improvement. We can add a patch, as I keep repeating
> > this is off-topic for this patch, to make it so that TopdownL1 isn't
> > enabled on Intel CPUs pre-Icelake, but I would discourage this.
>
>
> We need the TopdownL1. We just don't need TopdownL1 in the perf stat
> default when perf metrics feature is not available.

Perf metrics is an Intel only Icelake+ feature. I suggest the simplest
way to achieve this would be to remove the TopdownL1 metric group from
all Intel metrics before Icelake. This will mean on these
architectures the group TmaL1 will need using instead.

Thanks,
Ian

> >
> >> The reason we once added the TopdownL1 for ICL and later platform is
> >> that it doesn't break the original design (a *clean* output).
> >
> > Right, and in 6.3-rc7 the aggregation of counts was broken because of
> > duplicated counts and hard coded metrics (I did a last minute partial
> > fix). In perf-tools-next aggregation was fixed and we switched to the
> > json metrics, that are accurate and up-to-date with the latest TMA
> > metrics, so that we wouldn't need to maintain a duplicate code path.
> > What keys enabling TopdownL1 in 6.3 is the presence of topdown events
> > whilst in perf-tools-next it is the presence of TopdownL1 metric
> > group, as this is a more consistent approach and had first been
> > proposed by ARM.
>
> A consistent approach is good only when it can benefits all parties
> rather than sacrifices any of them.
>
> Apparently, the approach in the perf-tools-next brings all kinds of
> issues, multiplexing/inaccuracy in the perf stat default on Intel
> platforms. Why cannot we fix it properly before applying the approach?
>
> I think Andi also mentioned the similar request when ARM introduced the
> TopdownL1 metrics.
> https://lore.kernel.org/linux-perf-users/12e0deef-08db-445f-4958-bcd5c3e10367@xxxxxxxxxxxxxxx/
>
> Thanks,
> Kan