Re:Re: [Regression or Fix]perf: profiling stats significantly changed for aio_write/read(ext4) between 6.7.0-rc1 and 6.6.0

From: David Wang
Date: Wed Nov 15 2023 - 23:09:29 EST



At 2023-11-16 00:26:06, "Namhyung Kim" <namhyung@xxxxxxxxxx> wrote:
>On Wed, Nov 15, 2023 at 8:12 AM David Wang <00107082@xxxxxxx> wrote:
>>
>>
>> At 2023-11-15 23:48:33, "Namhyung Kim" <namhyung@xxxxxxxxxx> wrote:
>> >On Wed, Nov 15, 2023 at 3:00 AM David Wang <00107082@xxxxxxx> wrote:
>> >>
>> >>
>> >>
>> >> At 2023-11-15 18:32:41, "Peter Zijlstra" <peterz@xxxxxxxxxxxxx> wrote:
>> >> >
>> >> >Namhyung, could you please take a look, you know how to operate this
>> >> >cgroup stuff.
>> >> >
>> >>
>> >> More information: I ran the profiling on an 8-CPU machine, on an SSD with an ext4 filesystem:
>> >>
>> >> # mkdir /sys/fs/cgroup/mytest
>> >> # echo $$ > /sys/fs/cgroup/mytest/cgroup.procs
>> >> ## Start profiling targeting cgroup /sys/fs/cgroup/mytest on another terminal
>> >> # fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --bs=4k --iodepth=64 --size=1G --readwrite=randrw --runtime=600 --numjobs=4 --time_based=1
>> >>
>> >> I have a feeling that f06cc667f7990 decreases the total sample count by 10%~20% when profiling an IO benchmark within a cgroup.


>
>Then what is your profiling tool? Where did you see
>the 10%~20% drop in samples?
>

I wrote a simple, raw tool just for profiling callchains; it uses perf_event_open with the following attr:
attr.type = PERF_TYPE_SOFTWARE;
attr.config = PERF_COUNT_SW_CPU_CLOCK;
attr.sample_freq = 777; // adjust it
attr.freq = 1;
attr.wakeup_events = 16;
attr.sample_type = PERF_SAMPLE_TID|PERF_SAMPLE_CALLCHAIN;
attr.sample_max_stack = 32;

The source code could be found here: https://github.com/zq-david-wang/linux-tools/tree/main/perf/profiler
>>
>> I am not experienced with the perf-tool at all, too complicated a tool for me.... But I think I can try it.
>
>I feel sorry about that. In most cases, just `perf record -a` and
>then `perf report` would work well. :)
>
Thanks for the information. I used the following command to profile with perf:
`./perf record -a -e cpu-clock -G mytest`
I ran several rounds of tests, rebooting the system before each one. The perf output is:

On 6.7.0-rc1:
$ sudo ./perf record -a -e cpu-clock -G mytest
^C[ perf record: Woken up 527 times to write data ]
[ perf record: Captured and wrote 132.648 MB perf.data (2478745 samples) ]
---reboot
$ sudo ./perf record -a -e cpu-clock -G mytest
^C[ perf record: Woken up 473 times to write data ]
[ perf record: Captured and wrote 119.205 MB perf.data (2226994 samples) ]


On 6.7.0-rc1 with f06cc667f79909e9175460b167c277b7c64d3df0 reverted

$ sudo ./perf record -a -e cpu-clock -G mytest
^C[ perf record: Woken up 567 times to write data ]
[ perf record: Captured and wrote 142.771 MB perf.data (2668224 samples) ]
---reboot
$ sudo ./perf record -a -e cpu-clock -G mytest
^C[ perf record: Woken up 557 times to write data ]
[ perf record: Captured and wrote 140.604 MB perf.data (2627167 samples) ]


I also ran with `-F 777`, an arbitrary frequency that I use in my own tool, just to compare against it.

On 6.7.0-rc1
$ sudo ./perf record -a -e cpu-clock -F 777 -G mytest
^C[ perf record: Woken up 93 times to write data ]
[ perf record: Captured and wrote 24.575 MB perf.data (455222 samples) ] (My tool gets only ~359K samples, and the count is not stable)


On 6.7.0-rc1 with f06cc667f79909e9175460b167c277b7c64d3df0 reverted
$ sudo ./perf record -a -e cpu-clock -F 777 -G mytest
^C[ perf record: Woken up 98 times to write data ]
[ perf record: Captured and wrote 25.703 MB perf.data (476390 samples) ] (My tool gets ~446K samples, and the count is stable)


From the data I collected, I think two problems can be observed with f06cc667f79909e9175460b167c277b7c64d3df0:
1. Samples are missing.
2. The sample count is unstable: the total drifts a lot between tests.
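
To put rough numbers on these two observations, here is a minimal Python sketch that works only from the sample counts reported above. It assumes every run covered a comparable duration (the runs were stopped with ^C, so this is not guaranteed), and it treats "~359K" and "~446K" as approximate values. The helper names are made up for this illustration and are not part of perf or of my tool.

```python
# Sample counts copied from the perf record output in this mail.
rc1_default = [2478745, 2226994]        # 6.7.0-rc1, default frequency
reverted_default = [2668224, 2627167]   # commit reverted, default frequency
rc1_f777 = 455222                       # 6.7.0-rc1, -F 777
reverted_f777 = 476390                  # commit reverted, -F 777
tool_rc1 = 359_000                      # "~359K" from my tool, approximate
tool_reverted = 446_000                 # "~446K" from my tool, approximate


def mean(counts):
    """Arithmetic mean of a list of sample counts."""
    return sum(counts) / len(counts)


def relative_drop(baseline, observed):
    """Fraction of samples missing relative to the baseline count."""
    return (baseline - observed) / baseline


def relative_spread(counts):
    """(max - min) / mean: a crude drift measure for a handful of runs."""
    return (max(counts) - min(counts)) / mean(counts)


# Observation 1: samples missing (reverted kernel used as the baseline).
drop_default = relative_drop(mean(reverted_default), mean(rc1_default))
drop_f777 = relative_drop(reverted_f777, rc1_f777)
drop_tool = relative_drop(tool_reverted, tool_rc1)

# Observation 2: run-to-run drift of the total sample count.
spread_rc1 = relative_spread(rc1_default)
spread_reverted = relative_spread(reverted_default)

print(f"drop, perf default freq: {drop_default:.1%}")
print(f"drop, perf -F 777:       {drop_f777:.1%}")
print(f"drop, my tool:           {drop_tool:.1%}")
print(f"spread, 6.7.0-rc1:       {spread_rc1:.1%}")
print(f"spread, reverted:        {spread_reverted:.1%}")
```

With these inputs the drop is roughly 11% for perf at the default frequency, about 4% for perf with -F 777, and close to 20% for my tool. The spread between the two default-frequency runs is roughly 11% on 6.7.0-rc1 versus under 2% with the commit reverted. Two runs per kernel is a very small sample, so these figures only indicate the direction of the effect.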

Thanks
David