Re: [PATCH V2] perf top: Use evsel's cpus to replace user_requested_cpus

From: Ian Rogers
Date: Wed Dec 13 2023 - 12:43:05 EST


On Wed, Dec 13, 2023 at 7:45 AM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
>
>
>
> On 2023-12-12 8:06 p.m., Namhyung Kim wrote:
> > On Tue, Dec 12, 2023 at 2:12 PM Ian Rogers <irogers@xxxxxxxxxx> wrote:
> >>
> >> On Tue, Dec 12, 2023 at 1:25 PM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
> >>>
> >>>
> >>>
> >>> On 2023-12-12 3:37 p.m., Ian Rogers wrote:
> >>>> On Tue, Dec 12, 2023 at 11:39 AM <kan.liang@xxxxxxxxxxxxxxx> wrote:
> >>>>>
> >>>>> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> >>>>>
> >>>>> perf top errors out on a hybrid machine
> >>>>> $perf top
> >>>>>
> >>>>> Error:
> >>>>> The cycles:P event is not supported.
> >>>>>
> >>>>> The perf top expects that the "cycles" is collected on all CPUs in the
> >>>>> system. But for hybrid there is no single "cycles" event which can cover
> >>>>> all CPUs. Perf has to split it into two cycles events, e.g.,
> >>>>> cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask.
> >>>>> If a event is opened on the unsupported CPU. The open fails. That's the
> >>>>> reason of the above error out.
> >>>>>
> >>>>> Perf should only open the cycles event on the corresponding CPU. The
> >>>>> commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting
> >>>>> core PMU maps") intersect the requested CPU map with the CPU map of the
> >>>>> PMU. Use the evsel's cpus to replace user_requested_cpus.
> >>>>>
> >>>>> The evlist's threads are also propagated to the evsel's threads in
> >>>>> __perf_evlist__propagate_maps(). For a system-wide event, perf appends
> >>>>> a dummy event and assign it to the evsel's threads. For a per-thread
> >>>>> event, the evlist's thread_map is assigned to the evsel's threads. The
> >>>>> same as the other tools, e.g., perf record, using the evsel's threads
> >>>>> when opening an event.
> >>>>>
> >>>>> Reported-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> >>>>> Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@xxxxxxxxxx/
> >>>>> Reviewed-by: Ian Rogers <irogers@xxxxxxxxxx>
> >>>>> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> >>>>> ---
> >>>>>
> >>>>> Changes since V1:
> >>>>> - Update the description
> >>>>> - Add Reviewed-by from Ian
> >>>>
> >>>> Thanks Kan, quick question. Does "perf top" on hybrid ask the user to
> >>>> select between the cycles event on cpu_atom and cpu_core?
> >>>
> >>> Yes, but the event doesn't include the PMU information.
> >>> We probably need a follow up patch to append the PMU name.
> >>>
> >>> Available samples
> >>> 385 cycles:P
> >>>
> >>> 903 cycles:P
> >>
> >> Thanks and agreed, it isn't possible to tell which is which PMU/CPU
> >> type at the moment. I tried the patch with perf top --stdio, there
> >> wasn't a choice of event
>
> The perf top --stdio uses a dedicated display function, see
> perf_top__header_snprintf() in util/top.c
>
> It only shows one event at a time. "E" is used to switch the event.
>
> >> and I can't tell what counter is being
> >> displayed.
>
> For the hybrid case, I think we may display both PMU name and event
> name. For example,
>
> Available samples
> 656 cycles:P cpu_atom
>
> 701 cycles:P cpu_core

I think there would be more uniformity if we could do:
cpu/cycles/P
I'm just reminded of the stat output where sometimes you can get things like:
cpu/cycles/
or sometimes:
cycles [cpu]
It seems the slash style approach is the most uniform given it looks
like what is written on the command line. Perhaps we need some kind of
helper function to do this as reformatting the modifier is a pain.

> diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
> index f4812b226818..afc7a1d54fe4 100644
> --- a/tools/perf/ui/browsers/hists.c
> +++ b/tools/perf/ui/browsers/hists.c
> @@ -3433,8 +3433,10 @@ static void perf_evsel_menu__write(struct
> ui_browser *browser,
> }
>
> nr_events = convert_unit(nr_events, &unit);
> - printed = scnprintf(bf, sizeof(bf), "%lu%c%s%s", nr_events,
> - unit, unit == ' ' ? "" : " ", ev_name);
> + printed = scnprintf(bf, sizeof(bf), "%lu%c%s%s %s", nr_events,
> + unit, unit == ' ' ? "" : " ", ev_name,
> + evsel->pmu ? evsel->pmu_name : "");
> +
> ui_browser__printf(browser, "%s", bf);
>
> nr_events = evsel->evlist->stats.nr_events[PERF_RECORD_LOST];
>
>
> >> When I quit I also see:
> >> ```
> >> exiting.
> >> corrupted double-linked list
> >> Aborted (core dumped)
> >> ```
> >> but I wasn't able to repro this on a debuggable binary/system.
>
> I haven't see the issue yet.
>
> >>
> >> If my memory serves there was a patch where perf top was showing >1
> >> event. It would be nice here to do some kind of hybrid merging rather
> >> than having to view each PMU's top separately.
>
> The current perf top doesn't merge when there are >1 event.
> sudo ./perf top -e "cycles,instructions"
>
> Available samples
> 2K cycles
>
> 2K instructions
>
> For now, I think we may just append a PMU name to distinguish the hybrid
> case.
>
> We may implement the merging feature which impacts both hybrid and
> non-hybrid cases later separately.
>
> >
> > Using event groups, but I noticed you removed the --group option.
> > Maybe perf top can just use `{ ... }` notation for explicit grouping,
> > but it might be implicit like in the hybrid case.
> >
>
> Yes, if the events are from different PMUs, the perf tool will
> implicitly de-group the hybrid events. "{ ... }" may not help here.

I think grouping may have been the situation where I saw >1 event
displayed but I agree with Kan, we implicitly de-group events on
different PMUs so it won't help here.

Thanks,
Ian

> Thanks,
> Kan