Re: [PATCH v3 35/46] perf parse-events: Support hardware events as terms

From: Ian Rogers
Date: Tue May 02 2023 - 13:57:25 EST


On Tue, May 2, 2023 at 3:56 AM Ravi Bangoria <ravi.bangoria@xxxxxxx> wrote:
>
> > @@ -269,6 +279,16 @@ percore { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_PERCORE); }
> > aux-output { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT); }
> > aux-sample-size { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE); }
> > metric-id { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_METRIC_ID); }
> > +cpu-cycles|cycles { return hw_term(yyscanner, PERF_COUNT_HW_CPU_CYCLES); }
> > +stalled-cycles-frontend|idle-cycles-frontend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); }
> > +stalled-cycles-backend|idle-cycles-backend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); }
> > +instructions { return hw_term(yyscanner, PERF_COUNT_HW_INSTRUCTIONS); }
> > +cache-references { return hw_term(yyscanner, PERF_COUNT_HW_CACHE_REFERENCES); }
> > +cache-misses { return hw_term(yyscanner, PERF_COUNT_HW_CACHE_MISSES); }
> > +branch-instructions|branches { return hw_term(yyscanner, PERF_COUNT_HW_BRANCH_INSTRUCTIONS); }
> > +branch-misses { return hw_term(yyscanner, PERF_COUNT_HW_BRANCH_MISSES); }
> > +bus-cycles { return hw_term(yyscanner, PERF_COUNT_HW_BUS_CYCLES); }
> > +ref-cycles { return hw_term(yyscanner, PERF_COUNT_HW_REF_CPU_CYCLES); }
>
> These are generic terms and thus can be added to _any_ pmus. Ex:
>
> Before:
> ```
> $ sudo ./perf stat -e amd_l3/cycles/ -C 0 -- sleep 1
> event syntax error: 'amd_l3/cycles/'
> \___ unknown term 'cycles' for pmu 'amd_l3'
> ```
>
> After:
> ```
> $ sudo ./perf stat -e amd_l3/cycles/ -C 0 -vv -- sleep 1
> Control descriptor is not initialized
> ------------------------------------------------------------
> perf_event_attr:
> size 136
> config 0xb00000000
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
> sys_perf_event_open failed, error -2
> Warning:
> amd_l3/cycles/ event is not supported by the kernel.
> failed to read counter amd_l3/cycles/
>
> Performance counter stats for 'CPU(s) 0':
>
> <not supported> amd_l3/cycles/
>
> 1.059391819 seconds time elapsed
> ```
>
> Here `type=0` which means perf is trying to open event on HW pmu instead
> of amd_l3//. `config` is 0xb00000000 where 0xb is amd_l3 pmu type:
>
> $ cat /sys/bus/event_source/devices/amd_l3/type
> 11
>
> Not sure if this is expected behavior.

I agree this is definitely a change, I'm not sure what we can do to
handle this better. Firstly, let's not be overly negative, the code
fixes it so that "branch-brs" (an AMD sysfs event) is now a valid
event name rather than a parse error. For the example above we have
the case of an unsupported event now not giving a parse error, only
possible as all PMUs were scanned, but giving an open error as the
event name matches in the list of hardware events but the extended
type encoding of the PMU yields a PMU that can't handle it.

I think parsing should be concerned with taking information to yield a
struct perf_event_attr. When it gets into the game of testing PMUs for
support then we easily get test regressions, as the test system may
lack the PMUs that are part of the test. We also get issues on hybrid
like PERF_TYPE_HW_CACHE events being supported by one PMU and not the
other. The existing hybrid behavior is to silently skip the events
that aren't supported - tested with a perf_event_open. The behavior
after these patches is to open the event on both PMUs but display not
supported/counted generally on the atom PMU. So the coding decision
there was to say parsing and opening are separate stages, and the
correct place to fail was in the open.

We could change the event parsing for this case, have different parser
notions of core PMU and every PMU, then only allow the hardware events
on the core PMU. This would require hard coding a core PMU list into
the parser and I'm not keen on that.

Thanks,
Ian