Re: [PATCH] perf stat: Fix per socket shadow aggregation

From: Ian Rogers
Date: Mon Dec 06 2021 - 17:16:45 EST


On Mon, Dec 6, 2021 at 2:44 AM James Clark <james.clark@xxxxxxx> wrote:
>
>
>
> On 04/12/2021 02:34, Ian Rogers wrote:
> > An uncore device may have a CPU mask that specifies one CPU per socket:
> > $ cat /sys/devices/uncore_imc_0/cpumask
> > 0,18
> > The perf_stat_config aggr_map will map a CPU to the socket and other
> > aggregation values for it. Fix an error where the index into CPU mask
> > was being used as the index into the aggr_map. For the cpumask above the
> > indexes 0 and 1 are passed to aggr_map rather than the CPUs 0 and 18.
> >
> > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > ---
> > tools/perf/util/stat-display.c | 7 ++++---
> > 1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> > index 588601000f3f..7cfad5cfec38 100644
> > --- a/tools/perf/util/stat-display.c
> > +++ b/tools/perf/util/stat-display.c
> > @@ -516,7 +516,7 @@ static void printout(struct perf_stat_config *config, struct aggr_cpu_id id, int
> > static void aggr_update_shadow(struct perf_stat_config *config,
> > struct evlist *evlist)
> > {
> > - int cpu, s;
> > + int idx, cpu, s;
> > struct aggr_cpu_id s2, id;
> > u64 val;
> > struct evsel *counter;
> > @@ -525,11 +525,12 @@ static void aggr_update_shadow(struct perf_stat_config *config,
> > id = config->aggr_map->map[s];
> > evlist__for_each_entry(evlist, counter) {
> > val = 0;
> > - for (cpu = 0; cpu < evsel__nr_cpus(counter); cpu++) {
> > + for (idx = 0; idx < evsel__nr_cpus(counter); idx++) {
> > + cpu = perf_cpu_map__cpu(evsel__cpus(counter), idx);
> > s2 = config->aggr_get_id(config, evlist->core.cpus, cpu);
>
> Hi Ian,
>
> This same pattern of looping over the CPUs and calling aggr_get_id() is used a couple of
> other times. For example in aggr_cb() and first_shadow_cpu(). Do you think these also
> need updating?

Thanks for the feedback James!

For first_shadow_cpu the index is translated to the the cpu via the
cpu map here:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/stat-display.c?h=perf/core#n343
so I think it is sound.

aggr_cb looks to have the same bug as the index is being passed as the cpu:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/stat-display.c?h=perf/core#n643

> Or could we fix it in the aggr_get_id() functions so that they expect an index instead
> of CPU ID and do the conversion themselves? The callbacks do say "idx" rather than "cpu"
> so maybe there is still come confusion.
>
> For example:
>
> perf_stat__get_die_cached(struct perf_stat_config *config,
> struct perf_cpu_map *map, int idx)

Agreed on the naming confusion. I'm a fan of using single element
structs to get type safety in code like this. I wonder here if a
for_each_cpu on perf_cpu_map would clean this up best. I'll play with
a v2 patch set that addresses this problem more widely.

Thanks,
Ian

> > if (!cpu_map__compare_aggr_cpu_id(s2, id))
> > continue;
> > - val += perf_counts(counter->counts, cpu, 0)->val;
> > + val += perf_counts(counter->counts, idx, 0)->val;
> > }
> > perf_stat__update_shadow_stats(counter, val,
> > first_shadow_cpu(config, counter, id),
> >