Re: [PATCH V2] perf top: Use evsel's cpus to replace user_requested_cpus

From: Ian Rogers
Date: Tue Dec 12 2023 - 17:12:45 EST


On Tue, Dec 12, 2023 at 1:25 PM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
>
>
>
> On 2023-12-12 3:37 p.m., Ian Rogers wrote:
> > On Tue, Dec 12, 2023 at 11:39 AM <kan.liang@xxxxxxxxxxxxxxx> wrote:
> >>
> >> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> >>
> >> perf top errors out on a hybrid machine
> >> $perf top
> >>
> >> Error:
> >> The cycles:P event is not supported.
> >>
> >> perf top expects that "cycles" is collected on all CPUs in the
> >> system. But for hybrid there is no single "cycles" event which can cover
> >> all CPUs. Perf has to split it into two cycles events, e.g.,
> >> cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask.
> >> If an event is opened on an unsupported CPU, the open fails. That's the
> >> reason for the above error.
> >>
> >> Perf should only open the cycles event on the corresponding CPUs. The
> >> commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting
> >> core PMU maps") intersects the requested CPU map with the CPU map of the
> >> PMU. Use the evsel's cpus to replace user_requested_cpus.
> >>
> >> The evlist's threads are also propagated to the evsel's threads in
> >> __perf_evlist__propagate_maps(). For a system-wide event, perf appends
> >> a dummy event and assigns it to the evsel's threads. For a per-thread
> >> event, the evlist's thread_map is assigned to the evsel's threads. As
> >> in the other tools, e.g., perf record, use the evsel's threads when
> >> opening an event.
> >>
> >> Reported-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> >> Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@xxxxxxxxxx/
> >> Reviewed-by: Ian Rogers <irogers@xxxxxxxxxx>
> >> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> >> ---
> >>
> >> Changes since V1:
> >> - Update the description
> >> - Add Reviewed-by from Ian
> >
> > Thanks Kan, quick question. Does "perf top" on hybrid ask the user to
> > select between the cycles event on cpu_atom and cpu_core?
>
> Yes, but the event doesn't include the PMU information.
> We probably need a follow up patch to append the PMU name.
>
> Available samples
> 385 cycles:P
>
> 903 cycles:P

Thanks and agreed, at the moment it isn't possible to tell which PMU/CPU
type each entry belongs to. I tried the patch with perf top --stdio:
there was no choice of event, and I can't tell which counter is being
displayed. When I quit I also see:
```
exiting.
corrupted double-linked list
Aborted (core dumped)
```
but I wasn't able to reproduce this on a debuggable binary/system.

If memory serves, there was a patch where perf top showed more than one
event. It would be nice here to do some kind of hybrid merging rather
than having to view each PMU's top separately.
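
As a rough illustration of the merging idea only: the sketch below sums
per-PMU sample counts that share an event name, so that cpu_core and
cpu_atom entries for "cycles:P" collapse into one row. The struct and
function names are hypothetical and not taken from the perf sources, and
it says nothing about how histogram entries themselves would be merged.
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-PMU sample count; not a structure from the perf sources. */
struct pmu_count {
	const char *pmu;	/* e.g. "cpu_core" or "cpu_atom" */
	const char *event;	/* e.g. "cycles:P" */
	unsigned long samples;
};

/* Hypothetical merged row: one entry per distinct event name. */
struct merged_count {
	const char *event;
	unsigned long samples;
};

/*
 * Sum the sample counts of entries that share an event name.
 * Returns the number of merged rows written to out (at most out_cap).
 * Events that do not fit in out are dropped.
 */
static size_t merge_by_event(const struct pmu_count *in, size_t n,
			     struct merged_count *out, size_t out_cap)
{
	size_t used = 0;

	assert(in != NULL && out != NULL);

	for (size_t i = 0; i < n; i++) {
		size_t j;

		/* Look for an existing row with the same event name. */
		for (j = 0; j < used; j++) {
			if (strcmp(out[j].event, in[i].event) == 0)
				break;
		}
		if (j == used) {
			if (used == out_cap)
				continue;	/* no room: drop this event */
			out[used].event = in[i].event;
			out[used].samples = 0;
			used++;
		}
		out[j].samples += in[i].samples;
	}
	return used;
}
```
Fed the two counts from the output above (385 and 903, assigned to
cpu_core and cpu_atom in either order since the output doesn't say which
is which), this gives a single "cycles:P" row with 1288 samples.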

Thanks,
Ian


> Thanks,
> Kan
>
> > I'm
> > wondering if there is some kind of missing "hybrid-merge"
> > functionality like we have for perf stat.
> >
> > Thanks,
> > Ian
> >
> >> tools/perf/builtin-top.c | 4 ++--
> >> 1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> >> index ea8c7eca5eee..cce9350177e2 100644
> >> --- a/tools/perf/builtin-top.c
> >> +++ b/tools/perf/builtin-top.c
> >> @@ -1027,8 +1027,8 @@ static int perf_top__start_counters(struct perf_top *top)
> >>
> >> evlist__for_each_entry(evlist, counter) {
> >> try_again:
> >> - if (evsel__open(counter, top->evlist->core.user_requested_cpus,
> >> - top->evlist->core.threads) < 0) {
> >> + if (evsel__open(counter, counter->core.cpus,
> >> + counter->core.threads) < 0) {
> >>
> >> /*
> >> * Specially handle overwrite fall back.
> >> --
> >> 2.35.1
> >>