[linus:master] [sched/eevdf] b01db23d59: hackbench.throughput -3.4% regression

From: kernel test robot
Date: Wed Oct 18 2023 - 10:55:31 EST




Hello,

kernel test robot noticed a -3.4% regression of hackbench.throughput on:


commit: b01db23d5923a35023540edc4f0c5f019e11ac7d ("sched/eevdf: Fix pick_eevdf()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: hackbench
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
parameters:

nr_threads: 50%
iterations: 4
mode: process
ipc: pipe
cpufreq_governor: performance




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202310182229.78d950b2-oliver.sang@xxxxxxxxx


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231018/202310182229.78d950b2-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
gcc-12/performance/pipe/4/x86_64-rhel-8.3/process/50%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp2/hackbench

commit:
8dafa9d0eb ("sched/eevdf: Fix min_deadline heap integrity")
b01db23d59 ("sched/eevdf: Fix pick_eevdf()")

8dafa9d0eb1a1550 b01db23d5923a35023540edc4f0
---------------- ---------------------------
value ± %stddev | %change | value ± %stddev | metric
4.958e+08 ± 4% +11.8% 5.544e+08 ± 9% cpuidle..time
47.00 ± 54% -74.1% 12.18 ± 87% turbostat.IPC
2127266 +10.3% 2347320 sched_debug.cpu.nr_switches.avg
36.94 ± 5% +10.7% 40.89 ± 3% sched_debug.cpu.nr_uninterruptible.stddev
8760010 +9.9% 9630735 vmstat.system.cs
1087178 +2.8% 1117923 vmstat.system.in
8690 ± 4% +108.2% 18091 ± 33% meminfo.Active
8034 ± 4% +117.0% 17435 ± 35% meminfo.Active(anon)
695914 ± 4% -10.0% 626568 ± 3% meminfo.Mapped
559.67 ± 71% +741.2% 4708 ± 11% perf-c2c.DRAM.remote
3119 ± 76% +972.2% 33448 ± 7% perf-c2c.HITM.local
236.00 ± 86% +240.7% 804.17 ± 19% perf-c2c.HITM.remote
3355 ± 76% +920.8% 34252 ± 7% perf-c2c.HITM.total
1052797 -3.4% 1016737 hackbench.throughput
1011957 -3.2% 979124 hackbench.throughput_avg
1052797 -3.4% 1016737 hackbench.throughput_best
917410 -3.0% 890080 hackbench.throughput_worst
99425414 +14.5% 1.138e+08 hackbench.time.involuntary_context_switches
6523 +3.2% 6728 hackbench.time.system_time
752.72 +2.6% 772.26 hackbench.time.user_time
4.408e+08 +13.9% 5.021e+08 hackbench.time.voluntary_context_switches
538.27 ± 56% +180.9% 1511 ± 52% numa-vmstat.node0.nr_active_anon
44253 ± 39% +609.8% 314098 ±115% numa-vmstat.node0.nr_inactive_anon
34617 ± 9% +83.5% 63505 ± 54% numa-vmstat.node0.nr_mapped
1543 ± 22% +16311.4% 253377 ±141% numa-vmstat.node0.nr_shmem
538.27 ± 56% +180.9% 1511 ± 52% numa-vmstat.node0.nr_zone_active_anon
44252 ± 39% +609.8% 314096 ±115% numa-vmstat.node0.nr_zone_inactive_anon
138022 ± 9% -32.6% 92993 ± 41% numa-vmstat.node1.nr_mapped
15216 ± 45% -52.7% 7199 ± 84% numa-vmstat.node1.nr_page_table_pages
2553 ± 49% +149.5% 6369 ± 50% numa-meminfo.node0.Active
2225 ± 58% +171.5% 6041 ± 53% numa-meminfo.node0.Active(anon)
177208 ± 38% +610.0% 1258121 ±115% numa-meminfo.node0.Inactive
177010 ± 39% +610.6% 1257927 ±115% numa-meminfo.node0.Inactive(anon)
138899 ± 9% +83.1% 254371 ± 54% numa-meminfo.node0.Mapped
6256 ± 23% +16125.5% 1015094 ±141% numa-meminfo.node0.Shmem
556497 ± 8% -33.2% 371757 ± 40% numa-meminfo.node1.Mapped
60668 ± 45% -52.7% 28716 ± 83% numa-meminfo.node1.PageTables
267338 ± 13% -23.1% 205462 ± 21% numa-meminfo.node1.Slab
2008 ± 5% +117.7% 4372 ± 35% proc-vmstat.nr_active_anon
173611 ± 5% -9.6% 157026 ± 3% proc-vmstat.nr_mapped
2008 ± 5% +117.7% 4372 ± 35% proc-vmstat.nr_zone_active_anon
1.243e+08 -1.8% 1.221e+08 proc-vmstat.numa_hit
1.242e+08 -1.8% 1.22e+08 proc-vmstat.numa_local
427.50 ± 26% +428.8% 2260 ± 15% proc-vmstat.pgactivate
1.244e+08 -1.8% 1.222e+08 proc-vmstat.pgalloc_normal
1.227e+08 -2.0% 1.202e+08 proc-vmstat.pgfree
550400 +3.7% 571008 proc-vmstat.unevictable_pgs_scanned
7.27 ± 68% -4.8 2.42 ±161% perf-profile.calltrace.cycles-pp.seq_read_iter.seq_read.vfs_read.ksys_read.do_syscall_64
7.27 ± 68% -4.8 2.42 ±161% perf-profile.calltrace.cycles-pp.seq_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
6.65 ± 77% -4.2 2.42 ±161% perf-profile.calltrace.cycles-pp.proc_single_show.seq_read_iter.seq_read.vfs_read.ksys_read
6.65 ± 77% -4.2 2.42 ±161% perf-profile.calltrace.cycles-pp.proc_pid_status.proc_single_show.seq_read_iter.seq_read.vfs_read
5.03 ±100% -4.1 0.93 ±223% perf-profile.calltrace.cycles-pp.number.vsnprintf.seq_printf.show_interrupts.seq_read_iter
2.08 ±223% +11.2 13.29 ± 53% perf-profile.calltrace.cycles-pp.perf_session__deliver_event.__ordered_events__flush.perf_session__process_user_event.reader__read_event.perf_session__process_events
2.78 ±223% +12.6 15.36 ± 54% perf-profile.calltrace.cycles-pp.perf_session__process_user_event.reader__read_event.perf_session__process_events.record__finish_output.__cmd_record
2.08 ±223% +13.3 15.36 ± 54% perf-profile.calltrace.cycles-pp.__ordered_events__flush.perf_session__process_user_event.reader__read_event.perf_session__process_events.record__finish_output
7.27 ± 68% -4.8 2.42 ±161% perf-profile.children.cycles-pp.seq_read
6.01 ± 75% -4.4 1.59 ±144% perf-profile.children.cycles-pp.number
6.65 ± 77% -4.2 2.42 ±161% perf-profile.children.cycles-pp.proc_single_show
6.65 ± 77% -4.2 2.42 ±161% perf-profile.children.cycles-pp.proc_pid_status
2.08 ±223% +11.2 13.29 ± 53% perf-profile.children.cycles-pp.perf_session__deliver_event
2.78 ±223% +12.6 15.36 ± 54% perf-profile.children.cycles-pp.perf_session__process_user_event
2.78 ±223% +12.6 15.36 ± 54% perf-profile.children.cycles-pp.__ordered_events__flush
6.01 ± 75% -4.4 1.59 ±144% perf-profile.self.cycles-pp.number
0.39 +0.0 0.42 perf-stat.i.branch-miss-rate%
1.983e+08 +5.2% 2.085e+08 perf-stat.i.branch-misses
9200253 +10.1% 10127041 perf-stat.i.context-switches
554899 +5.3% 584042 perf-stat.i.cpu-migrations
0.04 +0.0 0.05 perf-stat.i.dTLB-load-miss-rate%
33430860 +10.3% 36871351 perf-stat.i.dTLB-load-misses
0.85 -0.9% 0.84 perf-stat.i.ipc
9.94 ± 16% -26.2% 7.34 ± 24% perf-stat.i.major-faults
4603385 ± 3% +6.8% 4916292 ± 3% perf-stat.i.node-stores
0.39 +0.0 0.41 perf-stat.overall.branch-miss-rate%
0.04 +0.0 0.05 perf-stat.overall.dTLB-load-miss-rate%
1.947e+08 +5.3% 2.051e+08 perf-stat.ps.branch-misses
9028654 +10.2% 9953106 perf-stat.ps.context-switches
543976 +5.4% 573599 perf-stat.ps.cpu-migrations
32914153 +10.4% 36349710 perf-stat.ps.dTLB-load-misses
4512147 ± 3% +7.0% 4827523 ± 3% perf-stat.ps.node-stores
1.588e+13 +3.3% 1.64e+13 perf-stat.total.instructions
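
The %change column above can be sanity-checked by hand. What follows is a
minimal sketch, assuming %change is the relative difference between the value
measured on the parent commit (8dafa9d0eb) and the value measured on the tested
commit (b01db23d59). The helper name pct_change is hypothetical and is not part
of lkp-tests. Note that perf-profile rows and the "-rate%" rows report an
absolute difference in percentage points instead, so the formula does not apply
to them.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent (hypothetical helper)."""
    return (new - base) / base * 100.0


# (parent commit value, tested commit value), copied from the table above.
rows = {
    "hackbench.throughput": (1052797, 1016737),
    "vmstat.system.cs": (8760010, 9630735),
    "perf-stat.i.context-switches": (9200253, 10127041),
}

for metric, (base, new) in rows.items():
    # Print with an explicit sign and one decimal, like the report does.
    print(f"{pct_change(base, new):+6.1f}%  {metric}")
```

Values that the report prints in rounded scientific notation (e.g. 1.138e+08)
will only reproduce the listed %change approximately.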




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki