[perf stat] Extend --cpu to non-system-wide runs too? was Re: [PATCH v3] perf bench sched pipe: Add -G/--cgroups option

From: Arnaldo Carvalho de Melo
Date: Tue Oct 17 2023 - 08:28:18 EST


On Tue, Oct 17, 2023 at 01:40:07PM +0200, Ingo Molnar wrote:
> Side note: it might make sense to add a sane cpumask/affinity setting
> option to perf stat itself:
>
> perf stat --cpumask
>
> ... or so?
>
> We do have -C:
>
> -C, --cpu <cpu> list of cpus to monitor in system-wide
>
> ... but that's limited to --all-cpus, right?
>
> Perhaps we could extend --cpu to non-system-wide runs too?

Maybe I misunderstood your question, but it's a list of CPUs that limits
the counting to those CPUs:

On a mostly idle system (some browsers, etc):

[root@five ~]# perf stat -C 0,2 -e cycles -I 1000
# time counts unit events
1.001012960 207,999,675 cycles
2.002152464 157,058,633 cycles
3.002985969 174,590,102 cycles
4.003411871 216,250,416 cycles
5.004392310 180,537,857 cycles
6.005387846 171,036,571 cycles
7.006386564 156,461,753 cycles
8.007532366 158,010,466 cycles
9.008682339 164,971,366 cycles
^C 9.377946210 77,242,809 cycles

[root@five ~]#

Then:

[root@five ~]# perf stat -C 0 -e cycles -I 1000
# time counts unit events
1.001019469 69,833,637 cycles
2.002133490 111,297,731 cycles
3.003225211 90,416,063 cycles
4.003663853 34,189,884 cycles
5.004689751 34,583,822 cycles
6.005659918 33,284,110 cycles
7.006660396 62,080,246 cycles
^C 7.229236075 23,250,207 cycles

[root@five ~]#

But with a CPU hog pinned to CPU 0:

[root@five ~]# taskset -c 0 stress-ng --cpu 32 &
[1] 9859
[root@five ~]# stress-ng: info: [9859] defaulting to a 1 day, 0 secs run per stressor
stress-ng: info: [9859] dispatching hogs: 32 cpu

[root@five ~]#

[root@five ~]# perf stat -C 0,2 -e cycles -I 1000
# time counts unit events
1.001024379 4,838,680,041 cycles
2.008891551 4,849,936,963 cycles
3.017168975 4,835,710,170 cycles
4.025437789 4,847,294,589 cycles
5.033239780 4,825,463,385 cycles
6.039332959 4,834,989,373 cycles
^C 6.067478756 125,338,359 cycles

[root@five ~]# perf stat -C 2 -e cycles -I 1000
# time counts unit events
1.000215845 21,244,609 cycles
2.001216573 51,337,887 cycles
3.002278103 49,421,924 cycles
4.003339432 33,270,235 cycles
^C 4.338990744 14,178,759 cycles

[root@five ~]# perf stat -C 0 -e cycles -I 1000
# time counts unit events
1.000801562 4,767,090,700 cycles
2.001800540 4,761,384,154 cycles
3.002801468 4,768,816,073 cycles
^C 3.313349213 1,479,254,494 cycles

[root@five ~]#
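
As a quick sanity check on the three runs above: the -C 0,2 counts should be
roughly the -C 0 counts plus the -C 2 counts. A minimal Python sketch, using
only the full intervals copied from the output above (the short final lines
printed after ^C are left out, and the variable names are just for
illustration):

```python
from statistics import mean

# Full ~1s interval counts copied from the perf stat runs above
# (the short final interval printed after ^C is excluded).
cpus_0_2 = [4_838_680_041, 4_849_936_963, 4_835_710_170,
            4_847_294_589, 4_825_463_385, 4_834_989_373]
cpu_2 = [21_244_609, 51_337_887, 49_421_924, 33_270_235]
cpu_0 = [4_767_090_700, 4_761_384_154, 4_768_816_073]

# Average cycles per interval when counting both CPUs in one run...
combined = mean(cpus_0_2)
# ...versus the sum of the averages from the two single-CPU runs.
summed = mean(cpu_0) + mean(cpu_2)
rel_diff = abs(combined - summed) / combined

print(f"-C 0,2 mean:         {combined:,.0f} cycles/interval")
print(f"-C 0 + -C 2 means:   {summed:,.0f} cycles/interval")
print(f"relative difference: {rel_diff:.2%}")
```

The runs were taken one after the other and the intervals are not exactly
one second long, so only approximate agreement should be expected.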

If we try to specify both a PID and a CPU:

[root@five ~]# taskset -c 0 sleep 100m &
[2] 9964
[root@five ~]#
[root@five ~]# perf stat -C 0 -p 9964 -e cycles -I 1000
PID/TID switch overriding CPU
# time counts unit events
1.000929383 <not counted> cycles
2.001933839 <not counted> cycles
3.002927605 <not counted> cycles
4.003983793 <not counted> cycles
5.005051180 <not counted> cycles
6.006123168 <not counted> cycles
7.007182796 <not counted> cycles
8.008261274 <not counted> cycles
9.009324991 <not counted> cycles
^C 9.454324736 <not counted> cycles

[root@five ~]#


[root@five ~]# pidof stress-ng
9891 9890 9889 9888 9887 9886 9885 9884 9883 9882 9881 9880 9879 9878 9877 9876 9875 9874 9873 9872 9871 9870 9869 9868 9867 9866 9865 9864 9863 9862 9861 9860 9859
[root@five ~]# perf stat -C 0 -p 9860 -e cycles -I 1000
PID/TID switch overriding CPU
# time counts unit events
1.001045336 144,691,886 cycles
2.002170624 134,088,343 cycles
3.003257911 149,148,823 cycles
^C 3.301585761 40,468,152 cycles

[root@five ~]#

Do you want to profile some specific PID only when it runs on some
specific CPU?

That should work, as per man perf_event_open:

pid == 0 and cpu >= 0
This measures the calling process/thread only when running on the specified CPU.

But, as we saw above, the tooling prevents us from doing that: perf stat
lets -p override -C ("PID/TID switch overriding CPU") :-\
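
For reference, here is a small Python sketch of how perf_event_open(2)
documents the pid/cpu combinations. It only restates the man page; it does
not call the syscall, it ignores the PERF_FLAG_PID_CGROUP case, and
describe_target() is a made-up helper name, not anything in perf:

```python
def describe_target(pid: int, cpu: int) -> str:
    """Summarise what a (pid, cpu) pair selects, per perf_event_open(2).

    Hypothetical helper for illustration only; the cgroup mode
    (PERF_FLAG_PID_CGROUP, where pid is a file descriptor) is not modelled.
    """
    if pid < -1 or cpu < -1:
        raise ValueError("this sketch only handles pid >= -1 and cpu >= -1")
    if pid == -1 and cpu == -1:
        # The man page marks this combination as invalid.
        return "invalid: perf_event_open() returns an error"
    if pid == -1:
        # System-wide on one CPU; needs elevated privileges.
        return f"all processes/threads on CPU {cpu}"
    who = "the calling process/thread" if pid == 0 else f"process/thread {pid}"
    if cpu == -1:
        return f"{who}, on any CPU"
    return f"{who}, only when running on CPU {cpu}"


# The case asked about in this thread: one task, one CPU.
print(describe_target(9860, 0))
# What 'perf stat -C 0' (no -p) asks for: everything on CPU 0.
print(describe_target(-1, 0))
```

So the pid > 0, cpu >= 0 combination is what a combined -p and -C would
need to pass down, if the tool did not drop the CPU list.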

- Arnaldo