Re: [PATCH v1] perf x86 test: Update hybrid expectations

From: Ian Rogers
Date: Wed Jan 03 2024 - 12:17:46 EST


On Wed, Jan 3, 2024 at 8:42 AM Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
>
> Em Tue, Jan 02, 2024 at 01:57:32PM -0800, Ian Rogers escreveu:
> > The legacy events cpu-cycles and instructions have sysfs event
> > equivalents on x86 (see /sys/devices/cpu_core/events). As sysfs/JSON
> > events are now higher in priority than legacy events this causes the
> > hybrid test expectations not to be met. To fix this switch to legacy
> > events that don't have sysfs versions, namely cpu-cycles becomes
> > cycles and instructions becomes branches.
> >
> > Fixes: a24d9d9dc096 ("perf parse-events: Make legacy events lower priority than sysfs/JSON")
> > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
>
> With it:
>
> root@number:/home/acme# perf test hybrid
> 71: Intel PT :
> 71.2: Intel PT hybrid CPU compatibility : Ok
> 75: x86 hybrid : Ok
> root@number:/home/acme#
>
> Applied.
>
> Now to look at this on this hybrid system (14700K):
>
> 101: perf all metricgroups test : FAILED!
>
> Testing Mem
> event syntax error: '{cpu_core/UNC_ARB_DAT_OCCUPANCY.RD,cmask=1,metric-id=cpu_core!3UNC_ARB_DAT_OCCUPANCY.RD!0cmask!21!3/,UNC_ARB_DAT_OCCUPANCY.RD/metric-id=UNC_ARB_DAT_OCCUPANCY.RD/}:W,du..'
> \___ Bad event or PMU
>
> Unable to find PMU or event on a PMU of 'cpu_core'
>
> Initial error:
> event syntax error: '{cpu_core/UNC_ARB_DAT_OCCUPANCY.RD,cmask=1,metric-id=cpu_core!3UNC_ARB_DAT_OCCUPANCY.RD!0cmask!21!3/,UNC_ARB_DAT_OCCUPANCY.RD/metric-id=UNC_ARB_DAT_OCCUPANCY.RD/}:W,du..'
> \___ unknown term 'UNC_ARB_DAT_OCCUPANCY.RD' for pmu 'cpu_core'
>
> valid terms: event,pc,edge,offcore_rsp,ldlat,inv,umask,frontend,cmask,config,config1,config2,config3,name,period,percore,metric-id
> test child finished with -1
> ---- end ----
> perf all metricgroups test: FAILED!
> root@number:/home/acme# grep -m1 "model name" /proc/cpuinfo
> model name : Intel(R) Core(TM) i7-14700K
> root@number:/home/acme#
>
>
> root@number:/home/acme# ls -la /sys/devices/uncore_
> uncore_arb_0/ uncore_cbox_1/ uncore_cbox_2/ uncore_cbox_5/ uncore_cbox_8/ uncore_imc_0/ uncore_imc_free_running_1/
> uncore_arb_1/ uncore_cbox_10/ uncore_cbox_3/ uncore_cbox_6/ uncore_cbox_9/ uncore_imc_1/
> uncore_cbox_0/ uncore_cbox_11/ uncore_cbox_4/ uncore_cbox_7/ uncore_clock/ uncore_imc_free_running_0/
> root@number:/home/acme# ls -la /sys/devices/uncore_
>
>
> 102: perf all metrics test : FAILED!
>
> event syntax error: '{cpu_core/UNC_ARB_DAT_OCCUPANCY.RD,cmask=1,metric-id=cpu..'
> \___ Bad event or PMU
>
> Unable to find PMU or event on a PMU of 'cpu_core'
>
> Initial error:
> event syntax error: '{cpu_core/UNC_ARB_DAT_OCCUPANCY.RD,cmask=1,metric-id=cpu..'
> \___ unknown term 'UNC_ARB_DAT_OCCUPANCY.RD' for pmu 'cpu_core'
>
> valid terms: event,pc,edge,offcore_rsp,ldlat,inv,umask,frontend,cmask,config,config1,config2,config3,name,period,percore,metric-id

I'll take a look. UNC_ARB* events use the uncore_arb_* PMUs, so the
cpu_core PMU shouldn't be specified. This looks like a bug in how the
metric is generated.
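
To make the failing event string easier to read, here is a minimal
sketch that decodes the metric-id term from the error above. The escape
table is an assumption based on my recollection of perf's metric-id
encoding ('!' plus a digit standing for one of ",-=@", with '@' used in
metric expressions in place of '/'). decode_metric_id is a hypothetical
helper, not something in the perf tree.

```python
# Assumed escape table: '!N' stands for the N-th character of ",-=@".
ESCAPES = {"!0": ",", "!1": "-", "!2": "=", "!3": "@"}


def decode_metric_id(encoded: str) -> str:
    """Expand '!N' escapes in a metric-id back to the original characters."""
    out = []
    i = 0
    while i < len(encoded):
        pair = encoded[i:i + 2]
        if pair in ESCAPES:
            # Two-character escape: emit the character it stands for.
            out.append(ESCAPES[pair])
            i += 2
        else:
            # Ordinary character: copy through unchanged.
            out.append(encoded[i])
            i += 1
    return "".join(out)


print(decode_metric_id("cpu_core!3UNC_ARB_DAT_OCCUPANCY.RD!0cmask!21!3"))
```

Under that assumption the id decodes to
cpu_core@UNC_ARB_DAT_OCCUPANCY.RD,cmask=1@, which shows the cpu_core PMU
being attached to an uncore event.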

> Testing UNCORE_FREQ
> Metric 'UNCORE_FREQ' not printed in:
> event syntax error: '{tma_info_system_socket_clks/metric-id=tma_info_system_s..'
> \___ Bad event or PMU
>
> Unable to find PMU or event on a PMU of 'tma_info_system_socket_clks'
>
> Initial error:
> event syntax error: '{tma_info_system_socket_clks/metric-id=tma_info_system_s..'
> \___ Cannot find PMU `tma_info_system_socket_clks'. Missing kernel support?
> Testing tma_info_system_socket_clks

A similar bug, but not the same one, as differing PMUs aren't involved:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py#L1459

I also see what may be a PMU driver bug in:
```
...
Metric 'tma_slow_pause' not printed in:
# Running 'internals/synthesize' benchmark:
Computing performance of single threaded perf event synthesis by
synthesizing events on the perf process itself:
Average synthesis took: 11.657 usec (+- 0.039 usec)
Average num. events: 4.000 (+- 0.000)
Average time per event 2.914 usec
Average data synthesis took: 11.832 usec (+- 0.037 usec)
Average num. events: 13.000 (+- 0.000)
Average time per event 0.910 usec

Performance counter stats for 'perf bench internals synthesize':

<not counted> cpu_core/TOPDOWN.SLOTS/ (0.00%)
<not counted> cpu_core/topdown-retiring/ (0.00%)
<not counted> cpu_core/topdown-mem-bound/ (0.00%)
<not counted> cpu_core/topdown-bad-spec/ (0.00%)
<not counted> cpu_core/topdown-fe-bound/ (0.00%)
<not counted> cpu_core/topdown-be-bound/ (0.00%)
<not counted> cpu_core/RESOURCE_STALLS.SCOREBOARD/ (0.00%)
<not counted> cpu_core/EXE_ACTIVITY.1_PORTS_UTIL/ (0.00%)
<not counted> cpu_core/EXE_ACTIVITY.BOUND_ON_LOADS/ (0.00%)
<not counted> cpu_core/CPU_CLK_UNHALTED.PAUSE/ (0.00%)
<not counted> cpu_core/CYCLE_ACTIVITY.STALLS_TOTAL/ (0.00%)
<not counted> cpu_core/CPU_CLK_UNHALTED.THREAD/ (0.00%)
<not counted> cpu_core/ARITH.DIV_ACTIVE/ (0.00%)
<not counted> cpu_core/EXE_ACTIVITY.2_PORTS_UTIL,umask=0xc/ (0.00%)
<not counted> cpu_core/EXE_ACTIVITY.3_PORTS_UTIL,umask=0x80/ (0.00%)

0.327060340 seconds time elapsed

0.114906000 seconds user
0.210001000 seconds sys
...
```

as adding --metric-no-group fixes the issue. Adding --metric-no-group
shouldn't be necessary, as perf_event_open should fail and cause the
weak group to break (hence the possible PMU driver bug). Perhaps
something is going wrong in weak group breaking on hybrid.
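
To illustrate the expectation, here is a toy model of weak group
fallback. It is a sketch under stated assumptions, not perf's actual
code: open_weak_group, strict_open and lenient_open are hypothetical
names, and the 4-event limit is an arbitrary stand-in for a counter
constraint. The point is that the group is only broken when the open
call itself reports failure.

```python
from typing import Callable, List


def open_weak_group(events: List[str],
                    try_open: Callable[[List[str]], bool]) -> List[List[str]]:
    """Return the groups that end up open after weak group handling."""
    if try_open(events):
        # The whole group opened, so no fallback is attempted.
        return [events]
    # The group open failed: break the group and open each event alone.
    return [[ev] for ev in events if try_open([ev])]


def strict_open(group: List[str]) -> bool:
    # Hypothetical driver that rejects groups it cannot schedule.
    return len(group) <= 4


def lenient_open(group: List[str]) -> bool:
    # Hypothetical driver that accepts any group, schedulable or not.
    return True


events = [f"ev{i}" for i in range(6)]
print(len(open_weak_group(events, strict_open)))   # 6 single-event groups
print(len(open_weak_group(events, lenient_open)))  # 1 oversized group
```

In the second case nothing triggers the fallback, which in this model
corresponds to the oversized group staying intact and never being
counted.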

Thanks,
Ian

> - Arnaldo