Re: [RFC PATCH v1] perf parse-events: Make legacy events lower priority than sysfs/json

From: Ian Rogers
Date: Thu Nov 23 2023 - 10:30:03 EST


On Thu, Nov 23, 2023 at 7:16 AM Marc Zyngier <maz@xxxxxxxxxx> wrote:
>
> On Thu, 23 Nov 2023 04:29:22 +0000,
> Ian Rogers <irogers@xxxxxxxxxx> wrote:
> >
> > The perf tool has previously made legacy events the priority so with
> > or without a PMU the legacy event would be opened:
> >
> > ```
> > $ perf stat -e cpu-cycles,cpu/cpu-cycles/ true
> > Using CPUID GenuineIntel-6-8D-1
> > intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
> > Attempting to add event pmu 'cpu' with 'cpu-cycles,' that may result in non-fatal errors
> > After aliases, add event pmu 'cpu' with 'cpu-cycles,' that may result in non-fatal errors
> > Control descriptor is not initialized
> > ------------------------------------------------------------
> > perf_event_attr:
> > type 0 (PERF_TYPE_HARDWARE)
> > size 136
> > config 0 (PERF_COUNT_HW_CPU_CYCLES)
> > sample_type IDENTIFIER
> > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> > disabled 1
> > inherit 1
> > enable_on_exec 1
> > exclude_guest 1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid 833967 cpu -1 group_fd -1 flags 0x8 = 3
> > ------------------------------------------------------------
> > perf_event_attr:
> > type 0 (PERF_TYPE_HARDWARE)
> > size 136
> > config 0 (PERF_COUNT_HW_CPU_CYCLES)
> > sample_type IDENTIFIER
> > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> > disabled 1
> > inherit 1
> > enable_on_exec 1
> > exclude_guest 1
> > ------------------------------------------------------------
> > ...
> > ```
> >
> > Fixes to make hybrid/BIG.little PMUs behave correctly, ie as core PMUs
> > capable of opening legacy events on each, removing hard coded
> > "cpu_core" and "cpu_atom" Intel PMU names, etc. caused a behavioral
> > difference on Apple/ARM due to latent issues in the PMU driver
> > reported in:
> > https://lore.kernel.org/lkml/08f1f185-e259-4014-9ca4-6411d5c1bc65@xxxxxxxxx/
> >
>
> What issue? So far, I don't see anything that remotely looks like a
> kernel issue.
>
> > As part of that report Mark Rutland <mark.rutland@xxxxxxx> requested
> > that legacy events not be higher in priority when a PMU is specified
> > reversing what has until this change been perf's default
> > behavior. With this change the above becomes:
> >
> > ```
> > $ perf stat -e cpu-cycles,cpu/cpu-cycles/ true
> > Using CPUID GenuineIntel-6-8D-1
> > Attempt to add: cpu/cpu-cycles=0/
> > ..after resolving event: cpu/event=0x3c/
> > Control descriptor is not initialized
> > ------------------------------------------------------------
> > perf_event_attr:
> > type 0 (PERF_TYPE_HARDWARE)
> > size 136
> > config 0 (PERF_COUNT_HW_CPU_CYCLES)
> > sample_type IDENTIFIER
> > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> > disabled 1
> > inherit 1
> > enable_on_exec 1
> > exclude_guest 1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid 827628 cpu -1 group_fd -1 flags 0x8 = 3
> > ------------------------------------------------------------
> > perf_event_attr:
> > type 4 (PERF_TYPE_RAW)
> > size 136
> > config 0x3c
> > sample_type IDENTIFIER
> > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> > disabled 1
> > inherit 1
> > enable_on_exec 1
> > exclude_guest 1
> > ------------------------------------------------------------
> > ...
> > ```
> >
> > So the second event has become a raw event as
> > /sys/devices/cpu/events/cpu-cycles exists.
> >
> > A fix was necessary to config_term_pmu in parse-events.c as
> > check_alias expansion needs to happen after config_term_pmu, and
> > config_term_pmu may need calling a second time because of this.
> >
> > config_term_pmu is updated to not use the legacy event when the PMU
> > has such a named event (either from json or sysfs).
> >
> > The bulk of this change is updating all of the parse-events test
> > expectations so that if a sysfs/json event exists for a PMU the test
> > doesn't fail - a further sign, if it were needed, that the legacy
> > event priority was a known and tested behavior of the perf tool.
> >
> > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
>
> Reported-by:, Link: and Fixes: tags would be appreciated.
>
> > ---
> > This is a large behavioral change:
> > 1) the scope of the change means it should bake on linux-next and I
> > don't believe should be a 6.7-rc fix.
>
> I entirely disagree. Distros are shipping a broken perf tool.
>
> > 2) a fixes tag and stable backport I don't think are appropriate. The
> > real reported issue is with the PMU driver. A backport would bring the
> > risk that later fixes, due to the large behavior change, wouldn't be
> > backported and past releases get regressed in scenarios like
> > hybrid. Backports for the perf tool are also less necessary than say a
> > buggy PMU driver, as distributions should be updating to the latest
> > perf tool regardless of what Linux kernel is being run (the perf tool
> > is backward compatible).
>
> Again, perf gets shipped in distros, and not necessary as the latest
> version. Rather, they tend to ship the version matching the kernel. No
> backport, buggy perf.

Please complain to the distros. I complained to Debian, we got rid of
the horrible wrapper script thing they did. I complained to two
separate Ubuntu people over the last two weeks as they still have
broken packaging even though they derive from Debian. Fedora is of
course perfect as Arnaldo oversees it :-)

> And again, I don't see a bug in the PMU driver.

Whether the PMU driver is requested a legacy cycles event or the
cycles event as an event code, the PMU driver should support it.
Supporting legacy events is just something core PMU drivers do. This
workaround wouldn't be necessary were it not for this PMU bug.

This change impacts every user of perf not just a partial fix to
workaround ARM PMU driver issues, see the updated parse-events test
for a list of what a simple test sees as a behavior change.

Thanks,
Ian

> Now, in the interest of moving forward: this patch seems to address
> the basic problems I was seeing on both M1/M2 (running any kernel
> version) and Juno (running an older kernel), so:
>
> Tested-by: Marc Zyngier <maz@xxxxxxxxxx>
>
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.