Re: [PATCH v1 06/14] libperf cpumap: Add any, empty and min helpers

From: Ian Rogers
Date: Tue Dec 12 2023 - 15:27:30 EST


On Tue, Dec 12, 2023 at 7:06 AM James Clark <james.clark@xxxxxxx> wrote:
>
>
>
> On 29/11/2023 06:02, Ian Rogers wrote:
> > Additional helpers to better replace
> > perf_cpu_map__has_any_cpu_or_is_empty.
> >
> > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > ---
> > tools/lib/perf/cpumap.c | 27 +++++++++++++++++++++++++++
> > tools/lib/perf/include/perf/cpumap.h | 16 ++++++++++++++++
> > tools/lib/perf/libperf.map | 4 ++++
> > 3 files changed, 47 insertions(+)
> >
> > diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
> > index 49fc98e16514..7403819da8fd 100644
> > --- a/tools/lib/perf/cpumap.c
> > +++ b/tools/lib/perf/cpumap.c
> > @@ -316,6 +316,19 @@ bool perf_cpu_map__has_any_cpu_or_is_empty(const struct perf_cpu_map *map)
> > return map ? __perf_cpu_map__cpu(map, 0).cpu == -1 : true;
> > }
> >
> > +bool perf_cpu_map__is_any_cpu_or_is_empty(const struct perf_cpu_map *map)
> > +{
> > + if (!map)
> > + return true;
> > +
> > + return __perf_cpu_map__nr(map) == 1 && __perf_cpu_map__cpu(map, 0).cpu == -1;
> > +}
>
> I'm struggling to understand the relevance of the difference between
> has_any and is_any I see that there is a slight difference, but could it
> not be refactored out so we only need one?

Yep, that's what these changes are working toward. For has any the set
{-1, 0, 1} would return true while is any will return false.
Previously the has any behavior was called "empty" which I think is
actively misleading.

> Do you ever get an 'any' map that has more than 1 entry? It's quite a
> subtle difference that is_any returns false if the first one is 'any'
> but then there are subsequent entries. Whereas has_any would return
> true. I'm not sure if future readers would be able to appreciate that.
>
> I see has_any is only used twice, both on evlist->all_cpus. Is there
> something about that member that means it could have a map that has an
> 'any' mixed with CPUs? Wouldn't that have the same result as a normal
> 'any' anyway?

The dummy event may be opened on any CPU but then a particular event
may be opened on certain CPUs. We merge CPU maps in places like evlist
so that we can iterate the appropriate CPUs for events and
open/enable/disable/close all events on a certain CPU at the same time
(we also set the affinity to that CPU to avoid IPIs). What I'm hoping
to do in these changes is reduce the ambiguity, the corner cases are
by their nature unusual.

An example of a corner case is, uncore events often get opened just on
CPU 0 but on a multi-socket system you may have a CPU 32 that also
needs to open the event. Previous code treated the CPU map index and
value it contained pretty interchangeably. This is often fine for the
core PMU but is clearly wrong in this uncore case, {0, 32} has indexes
0 and 1 but those indexes don't match the CPU numbers. The case of -1
has often previously been called dummy but I'm trying to call it the
"any CPU" case to match the perf_event_open man page (I'm hoping it
also makes it less ambiguous with any CPU being used with a particular
event like cycles, calling it dummy makes the event sound like it may
have sideband data). The difference between "all CPUs" and "any CPU"
is that an evsel for all CPUs would need the event opening
individually on each CPU, while any CPU events are a single open call.
Any CPU is only valid to perf_event_open if a PID is specified.
Depending on the set up there could be overlaps in what they count but
hopefully it is clearer what the distinction is. I believe the case of
"any CPU" and specific CPU numbers is more common with aux buffers and
Adrian has mentioned needing it for intel-pt.

Thanks,
Ian

> > +
> > +bool perf_cpu_map__is_empty(const struct perf_cpu_map *map)
> > +{
> > + return map == NULL;
> > +}
> > +
> > int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu)
> > {
> > int low, high;
> > @@ -372,6 +385,20 @@ bool perf_cpu_map__has_any_cpu(const struct perf_cpu_map *map)
> > return map && __perf_cpu_map__cpu(map, 0).cpu == -1;
> > }
> >
> > +struct perf_cpu perf_cpu_map__min(const struct perf_cpu_map *map)
> > +{
> > + struct perf_cpu cpu, result = {
> > + .cpu = -1
> > + };
> > + int idx;
> > +
> > + perf_cpu_map__for_each_cpu_skip_any(cpu, idx, map) {
> > + result = cpu;
> > + break;
> > + }
> > + return result;
> > +}
> > +
> > struct perf_cpu perf_cpu_map__max(const struct perf_cpu_map *map)
> > {
> > struct perf_cpu result = {
> > diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h
> > index dbe0a7352b64..523e4348fc96 100644
> > --- a/tools/lib/perf/include/perf/cpumap.h
> > +++ b/tools/lib/perf/include/perf/cpumap.h
> > @@ -50,6 +50,22 @@ LIBPERF_API int perf_cpu_map__nr(const struct perf_cpu_map *cpus);
> > * perf_cpu_map__has_any_cpu_or_is_empty - is map either empty or has the "any CPU"/dummy value.
> > */
> > LIBPERF_API bool perf_cpu_map__has_any_cpu_or_is_empty(const struct perf_cpu_map *map);
> > +/**
> > + * perf_cpu_map__is_any_cpu_or_is_empty - is map either empty or the "any CPU"/dummy value.
> > + */
> > +LIBPERF_API bool perf_cpu_map__is_any_cpu_or_is_empty(const struct perf_cpu_map *map);
> > +/**
> > + * perf_cpu_map__is_empty - does the map contain no values and it doesn't
> > + * contain the special "any CPU"/dummy value.
> > + */
> > +LIBPERF_API bool perf_cpu_map__is_empty(const struct perf_cpu_map *map);
> > +/**
> > + * perf_cpu_map__min - the minimum CPU value or -1 if empty or just the "any CPU"/dummy value.
> > + */
> > +LIBPERF_API struct perf_cpu perf_cpu_map__min(const struct perf_cpu_map *map);
> > +/**
> > + * perf_cpu_map__max - the maximum CPU value or -1 if empty or just the "any CPU"/dummy value.
> > + */
> > LIBPERF_API struct perf_cpu perf_cpu_map__max(const struct perf_cpu_map *map);
> > LIBPERF_API bool perf_cpu_map__has(const struct perf_cpu_map *map, struct perf_cpu cpu);
> > LIBPERF_API bool perf_cpu_map__equal(const struct perf_cpu_map *lhs,
> > diff --git a/tools/lib/perf/libperf.map b/tools/lib/perf/libperf.map
> > index 10b3f3722642..2aa79b696032 100644
> > --- a/tools/lib/perf/libperf.map
> > +++ b/tools/lib/perf/libperf.map
> > @@ -10,6 +10,10 @@ LIBPERF_0.0.1 {
> > perf_cpu_map__nr;
> > perf_cpu_map__cpu;
> > perf_cpu_map__has_any_cpu_or_is_empty;
> > + perf_cpu_map__is_any_cpu_or_is_empty;
> > + perf_cpu_map__is_empty;
> > + perf_cpu_map__has_any_cpu;
> > + perf_cpu_map__min;
> > perf_cpu_map__max;
> > perf_cpu_map__has;
> > perf_thread_map__new_array;