Re: [PATCH v2 3/5] mm: memcg: make stats flushing threshold per-memcg

From: kernel test robot
Date: Fri Oct 20 2023 - 12:18:25 EST




Hello,

kernel test robot noticed a -25.8% regression of will-it-scale.per_thread_ops on:


commit: 51d74c18a9c61e7ee33bc90b522dd7f6e5b80bb5 ("[PATCH v2 3/5] mm: memcg: make stats flushing threshold per-memcg")
url: https://github.com/intel-lab-lkp/linux/commits/Yosry-Ahmed/mm-memcg-change-flush_next_time-to-flush_last_time/20231010-112257
base: https://git.kernel.org/cgit/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/all/20231010032117.1577496-4-yosryahmed@xxxxxxxxxx/
patch subject: [PATCH v2 3/5] mm: memcg: make stats flushing threshold per-memcg

testcase: will-it-scale
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:

nr_task: 100%
mode: thread
test: fallocate1
cpufreq_governor: performance


In addition to that, the commit also has significant impact on the following tests:

+------------------+---------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops -30.0% regression |
| test machine | 104 threads 2 sockets (Skylake) with 192G memory |
| test parameters | cpufreq_governor=performance |
| | mode=thread |
| | nr_task=50% |
| | test=fallocate1 |
+------------------+---------------------------------------------------------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202310202303.c68e7639-oliver.sang@xxxxxxxxx


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231020/202310202303.c68e7639-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-8.3/thread/100%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/fallocate1/will-it-scale

commit:
130617edc1 ("mm: memcg: move vmstats structs definition above flushing code")
51d74c18a9 ("mm: memcg: make stats flushing threshold per-memcg")

130617edc1cd1ba1 51d74c18a9c61e7ee33bc90b522
---------------- ---------------------------
%stddev %change %stddev
\ | \
2.09 -0.5 1.61 ± 2% mpstat.cpu.all.usr%
27.58 +3.7% 28.59 turbostat.RAMWatt
3324 -10.0% 2993 vmstat.system.cs
1056 -100.0% 0.00 numa-meminfo.node0.Inactive(file)
6.67 ±141% +15799.3% 1059 numa-meminfo.node1.Inactive(file)
120.83 ± 11% +79.6% 217.00 ± 9% perf-c2c.DRAM.local
594.50 ± 6% +43.8% 854.83 ± 5% perf-c2c.DRAM.remote
3797041 -25.8% 2816352 will-it-scale.104.threads
36509 -25.8% 27079 will-it-scale.per_thread_ops
3797041 -25.8% 2816352 will-it-scale.workload
1.142e+09 -26.2% 8.437e+08 numa-numastat.node0.local_node
1.143e+09 -26.1% 8.439e+08 numa-numastat.node0.numa_hit
1.148e+09 -25.4% 8.563e+08 ± 2% numa-numastat.node1.local_node
1.149e+09 -25.4% 8.564e+08 ± 2% numa-numastat.node1.numa_hit
32933 -2.6% 32068 proc-vmstat.nr_slab_reclaimable
2.291e+09 -25.8% 1.7e+09 proc-vmstat.numa_hit
2.291e+09 -25.8% 1.7e+09 proc-vmstat.numa_local
2.29e+09 -25.8% 1.699e+09 proc-vmstat.pgalloc_normal
2.289e+09 -25.8% 1.699e+09 proc-vmstat.pgfree
1.00 ± 93% +154.2% 2.55 ± 16% perf-sched.sch_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
191.10 ± 2% +18.0% 225.55 ± 2% perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
385.50 ± 14% +39.6% 538.17 ± 12% perf-sched.wait_and_delay.count.__cond_resched.shmem_inode_acct_blocks.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate
118.67 ± 11% -62.6% 44.33 ±100% perf-sched.wait_and_delay.count.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
5043 ± 2% -13.0% 4387 ± 6% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
167.12 ±222% +200.1% 501.48 ± 99% perf-sched.wait_and_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
191.09 ± 2% +18.0% 225.53 ± 2% perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
293.46 ± 4% +12.8% 330.98 ± 6% perf-sched.wait_time.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
199.33 -100.0% 0.00 numa-vmstat.node0.nr_active_file
264.00 -100.0% 0.00 numa-vmstat.node0.nr_inactive_file
199.33 -100.0% 0.00 numa-vmstat.node0.nr_zone_active_file
264.00 -100.0% 0.00 numa-vmstat.node0.nr_zone_inactive_file
1.143e+09 -26.1% 8.439e+08 numa-vmstat.node0.numa_hit
1.142e+09 -26.2% 8.437e+08 numa-vmstat.node0.numa_local
1.67 ±141% +15799.3% 264.99 numa-vmstat.node1.nr_inactive_file
1.67 ±141% +15799.3% 264.99 numa-vmstat.node1.nr_zone_inactive_file
1.149e+09 -25.4% 8.564e+08 ± 2% numa-vmstat.node1.numa_hit
1.148e+09 -25.4% 8.563e+08 ± 2% numa-vmstat.node1.numa_local
0.59 ± 3% +125.2% 1.32 ± 2% perf-stat.i.MPKI
9.027e+09 -17.9% 7.408e+09 perf-stat.i.branch-instructions
0.64 -0.0 0.60 perf-stat.i.branch-miss-rate%
58102855 -23.3% 44580037 ± 2% perf-stat.i.branch-misses
15.28 +7.0 22.27 perf-stat.i.cache-miss-rate%
25155306 ± 2% +82.7% 45953601 ± 3% perf-stat.i.cache-misses
1.644e+08 +25.4% 2.062e+08 ± 2% perf-stat.i.cache-references
3258 -10.3% 2921 perf-stat.i.context-switches
6.73 +23.3% 8.30 perf-stat.i.cpi
145.97 -1.3% 144.13 perf-stat.i.cpu-migrations
11519 ± 3% -45.4% 6293 ± 3% perf-stat.i.cycles-between-cache-misses
0.04 -0.0 0.03 perf-stat.i.dTLB-load-miss-rate%
3921408 -25.3% 2929564 perf-stat.i.dTLB-load-misses
1.098e+10 -18.1% 8.993e+09 perf-stat.i.dTLB-loads
0.00 ± 2% +0.0 0.00 ± 4% perf-stat.i.dTLB-store-miss-rate%
5.606e+09 -23.2% 4.304e+09 perf-stat.i.dTLB-stores
95.65 -1.2 94.49 perf-stat.i.iTLB-load-miss-rate%
3876741 -25.0% 2905764 perf-stat.i.iTLB-load-misses
4.286e+10 -18.9% 3.477e+10 perf-stat.i.instructions
11061 +8.2% 11969 perf-stat.i.instructions-per-iTLB-miss
0.15 -18.9% 0.12 perf-stat.i.ipc
48.65 ± 2% +46.2% 71.11 ± 2% perf-stat.i.metric.K/sec
247.84 -18.9% 201.05 perf-stat.i.metric.M/sec
3138385 ± 2% +77.7% 5578401 ± 2% perf-stat.i.node-load-misses
375827 ± 3% +69.2% 635857 ± 11% perf-stat.i.node-loads
1343194 -26.8% 983668 perf-stat.i.node-store-misses
51550 ± 3% -19.0% 41748 ± 7% perf-stat.i.node-stores
0.59 ± 3% +125.1% 1.32 ± 2% perf-stat.overall.MPKI
0.64 -0.0 0.60 perf-stat.overall.branch-miss-rate%
15.30 +7.0 22.28 perf-stat.overall.cache-miss-rate%
6.73 +23.3% 8.29 perf-stat.overall.cpi
11470 ± 2% -45.3% 6279 ± 2% perf-stat.overall.cycles-between-cache-misses
0.04 -0.0 0.03 perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 2% +0.0 0.00 ± 4% perf-stat.overall.dTLB-store-miss-rate%
95.56 -1.4 94.17 perf-stat.overall.iTLB-load-miss-rate%
11059 +8.2% 11967 perf-stat.overall.instructions-per-iTLB-miss
0.15 -18.9% 0.12 perf-stat.overall.ipc
3396437 +9.5% 3718021 perf-stat.overall.path-length
8.997e+09 -17.9% 7.383e+09 perf-stat.ps.branch-instructions
57910417 -23.3% 44426577 ± 2% perf-stat.ps.branch-misses
25075498 ± 2% +82.7% 45803186 ± 3% perf-stat.ps.cache-misses
1.639e+08 +25.4% 2.056e+08 ± 2% perf-stat.ps.cache-references
3247 -10.3% 2911 perf-stat.ps.context-switches
145.47 -1.3% 143.61 perf-stat.ps.cpu-migrations
3908900 -25.3% 2920218 perf-stat.ps.dTLB-load-misses
1.094e+10 -18.1% 8.963e+09 perf-stat.ps.dTLB-loads
5.587e+09 -23.2% 4.289e+09 perf-stat.ps.dTLB-stores
3863663 -25.0% 2895895 perf-stat.ps.iTLB-load-misses
4.272e+10 -18.9% 3.466e+10 perf-stat.ps.instructions
3128132 ± 2% +77.7% 5559939 ± 2% perf-stat.ps.node-load-misses
375403 ± 3% +69.0% 634300 ± 11% perf-stat.ps.node-loads
1338688 -26.8% 980311 perf-stat.ps.node-store-misses
51546 ± 3% -19.1% 41692 ± 7% perf-stat.ps.node-stores
1.29e+13 -18.8% 1.047e+13 perf-stat.total.instructions
0.96 -0.3 0.70 ± 2% perf-profile.calltrace.cycles-pp.shmem_alloc_folio.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate.vfs_fallocate
0.97 -0.3 0.72 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.fallocate64
0.76 ± 2% -0.2 0.54 ± 3% perf-profile.calltrace.cycles-pp.shmem_inode_acct_blocks.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate.vfs_fallocate
0.82 -0.2 0.60 ± 2% perf-profile.calltrace.cycles-pp.alloc_pages_mpol.shmem_alloc_folio.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate
0.91 -0.2 0.72 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
0.68 +0.1 0.76 ± 2% perf-profile.calltrace.cycles-pp.lru_add_fn.folio_batch_move_lru.folio_add_lru.shmem_alloc_and_add_folio.shmem_get_folio_gfp
1.67 +0.1 1.77 perf-profile.calltrace.cycles-pp.shmem_add_to_page_cache.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate.vfs_fallocate
1.78 ± 2% +0.1 1.92 ± 2% perf-profile.calltrace.cycles-pp.filemap_remove_folio.truncate_inode_folio.shmem_undo_range.shmem_setattr.notify_change
0.69 ± 5% +0.1 0.84 ± 4% perf-profile.calltrace.cycles-pp.get_mem_cgroup_from_mm.__mem_cgroup_charge.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate
1.56 ± 2% +0.2 1.76 ± 2% perf-profile.calltrace.cycles-pp.__filemap_remove_folio.filemap_remove_folio.truncate_inode_folio.shmem_undo_range.shmem_setattr
0.85 ± 4% +0.4 1.23 ± 2% perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.shmem_add_to_page_cache.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate
0.78 ± 4% +0.4 1.20 ± 3% perf-profile.calltrace.cycles-pp.filemap_unaccount_folio.__filemap_remove_folio.filemap_remove_folio.truncate_inode_folio.shmem_undo_range
0.73 ± 4% +0.4 1.17 ± 3% perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.filemap_unaccount_folio.__filemap_remove_folio.filemap_remove_folio.truncate_inode_folio
48.39 +0.8 49.14 perf-profile.calltrace.cycles-pp.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
0.00 +0.8 0.77 ± 4% perf-profile.calltrace.cycles-pp.mem_cgroup_commit_charge.__mem_cgroup_charge.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate
40.24 +0.8 41.03 perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru.shmem_alloc_and_add_folio.shmem_get_folio_gfp
40.22 +0.8 41.01 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru.shmem_alloc_and_add_folio
0.00 +0.8 0.79 ± 3% perf-profile.calltrace.cycles-pp.__mod_memcg_lruvec_state.__mod_lruvec_page_state.shmem_add_to_page_cache.shmem_alloc_and_add_folio.shmem_get_folio_gfp
40.19 +0.8 40.98 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru
1.33 ± 5% +0.8 2.13 ± 4% perf-profile.calltrace.cycles-pp.__mem_cgroup_charge.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate.vfs_fallocate
48.16 +0.8 48.98 perf-profile.calltrace.cycles-pp.vfs_fallocate.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
0.00 +0.9 0.88 ± 2% perf-profile.calltrace.cycles-pp.__mod_memcg_lruvec_state.__mod_lruvec_page_state.filemap_unaccount_folio.__filemap_remove_folio.filemap_remove_folio
47.92 +0.9 48.81 perf-profile.calltrace.cycles-pp.shmem_fallocate.vfs_fallocate.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe
47.07 +0.9 48.01 perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_fallocate.vfs_fallocate.__x64_sys_fallocate.do_syscall_64
46.59 +1.1 47.64 perf-profile.calltrace.cycles-pp.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate.vfs_fallocate.__x64_sys_fallocate
0.99 -0.3 0.73 ± 2% perf-profile.children.cycles-pp.syscall_return_via_sysret
0.96 -0.3 0.70 ± 2% perf-profile.children.cycles-pp.shmem_alloc_folio
0.78 ± 2% -0.2 0.56 ± 3% perf-profile.children.cycles-pp.shmem_inode_acct_blocks
0.83 -0.2 0.61 ± 2% perf-profile.children.cycles-pp.alloc_pages_mpol
0.92 -0.2 0.73 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.74 ± 2% -0.2 0.55 ± 2% perf-profile.children.cycles-pp.xas_store
0.67 -0.2 0.50 ± 3% perf-profile.children.cycles-pp.__alloc_pages
0.43 -0.1 0.31 ± 2% perf-profile.children.cycles-pp.__entry_text_start
0.41 ± 2% -0.1 0.30 ± 3% perf-profile.children.cycles-pp.free_unref_page_list
0.35 -0.1 0.25 ± 2% perf-profile.children.cycles-pp.xas_load
0.35 ± 2% -0.1 0.25 ± 4% perf-profile.children.cycles-pp.__mod_lruvec_state
0.39 -0.1 0.30 ± 2% perf-profile.children.cycles-pp.get_page_from_freelist
0.27 ± 2% -0.1 0.19 ± 4% perf-profile.children.cycles-pp.__mod_node_page_state
0.32 ± 3% -0.1 0.24 ± 3% perf-profile.children.cycles-pp.find_lock_entries
0.23 ± 2% -0.1 0.15 ± 4% perf-profile.children.cycles-pp.xas_descend
0.28 ± 3% -0.1 0.20 ± 3% perf-profile.children.cycles-pp._raw_spin_lock
0.25 ± 3% -0.1 0.18 ± 3% perf-profile.children.cycles-pp.__dquot_alloc_space
0.16 ± 3% -0.1 0.10 ± 5% perf-profile.children.cycles-pp.xas_find_conflict
0.26 ± 2% -0.1 0.20 ± 3% perf-profile.children.cycles-pp.filemap_get_entry
0.26 -0.1 0.20 ± 2% perf-profile.children.cycles-pp.rmqueue
0.20 ± 3% -0.1 0.14 ± 3% perf-profile.children.cycles-pp.truncate_cleanup_folio
0.19 ± 5% -0.1 0.14 ± 4% perf-profile.children.cycles-pp.xas_clear_mark
0.17 ± 5% -0.0 0.12 ± 4% perf-profile.children.cycles-pp.xas_init_marks
0.15 ± 4% -0.0 0.10 ± 4% perf-profile.children.cycles-pp.free_unref_page_commit
0.18 ± 3% -0.0 0.14 ± 3% perf-profile.children.cycles-pp.__cond_resched
0.07 ± 5% -0.0 0.02 ± 99% perf-profile.children.cycles-pp.xas_find
0.13 ± 2% -0.0 0.09 perf-profile.children.cycles-pp.security_vm_enough_memory_mm
0.14 ± 4% -0.0 0.10 ± 7% perf-profile.children.cycles-pp.__fget_light
0.06 ± 6% -0.0 0.02 ± 99% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
0.12 ± 4% -0.0 0.08 ± 4% perf-profile.children.cycles-pp.xas_start
0.08 ± 5% -0.0 0.05 perf-profile.children.cycles-pp.__folio_throttle_swaprate
0.12 -0.0 0.08 ± 5% perf-profile.children.cycles-pp.folio_unlock
0.14 ± 3% -0.0 0.11 ± 3% perf-profile.children.cycles-pp.try_charge_memcg
0.12 ± 6% -0.0 0.08 ± 5% perf-profile.children.cycles-pp.free_unref_page_prepare
0.12 ± 3% -0.0 0.09 ± 4% perf-profile.children.cycles-pp.noop_dirty_folio
0.20 ± 2% -0.0 0.17 ± 5% perf-profile.children.cycles-pp.page_counter_uncharge
0.10 -0.0 0.07 ± 5% perf-profile.children.cycles-pp.cap_vm_enough_memory
0.09 ± 6% -0.0 0.06 ± 6% perf-profile.children.cycles-pp._raw_spin_trylock
0.09 ± 5% -0.0 0.06 ± 7% perf-profile.children.cycles-pp.inode_add_bytes
0.06 ± 6% -0.0 0.03 ± 70% perf-profile.children.cycles-pp.filemap_free_folio
0.06 ± 6% -0.0 0.03 ± 70% perf-profile.children.cycles-pp.percpu_counter_add_batch
0.12 ± 3% -0.0 0.09 ± 5% perf-profile.children.cycles-pp.__folio_cancel_dirty
0.12 ± 3% -0.0 0.10 ± 5% perf-profile.children.cycles-pp.shmem_recalc_inode
0.09 ± 5% -0.0 0.07 ± 7% perf-profile.children.cycles-pp.__vm_enough_memory
0.08 ± 5% -0.0 0.06 perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.08 ± 5% -0.0 0.06 perf-profile.children.cycles-pp.security_file_permission
0.08 ± 6% -0.0 0.05 ± 7% perf-profile.children.cycles-pp.apparmor_file_permission
0.09 ± 4% -0.0 0.07 ± 8% perf-profile.children.cycles-pp.__percpu_counter_limited_add
0.08 ± 6% -0.0 0.06 ± 8% perf-profile.children.cycles-pp.__list_add_valid_or_report
0.07 ± 8% -0.0 0.05 perf-profile.children.cycles-pp.get_pfnblock_flags_mask
0.14 ± 3% -0.0 0.12 ± 6% perf-profile.children.cycles-pp.cgroup_rstat_updated
0.07 ± 5% -0.0 0.05 perf-profile.children.cycles-pp.policy_nodemask
0.24 ± 2% -0.0 0.22 ± 2% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.08 -0.0 0.07 ± 7% perf-profile.children.cycles-pp.xas_create
0.69 +0.1 0.78 perf-profile.children.cycles-pp.lru_add_fn
1.72 ± 2% +0.1 1.80 perf-profile.children.cycles-pp.shmem_add_to_page_cache
1.79 ± 2% +0.1 1.93 ± 2% perf-profile.children.cycles-pp.filemap_remove_folio
0.13 ± 5% +0.1 0.28 perf-profile.children.cycles-pp.file_modified
0.69 ± 5% +0.1 0.84 ± 3% perf-profile.children.cycles-pp.get_mem_cgroup_from_mm
0.09 ± 7% +0.2 0.24 ± 2% perf-profile.children.cycles-pp.inode_needs_update_time
1.58 ± 3% +0.2 1.77 ± 2% perf-profile.children.cycles-pp.__filemap_remove_folio
0.15 ± 3% +0.4 0.50 ± 3% perf-profile.children.cycles-pp.__count_memcg_events
0.79 ± 4% +0.4 1.20 ± 3% perf-profile.children.cycles-pp.filemap_unaccount_folio
0.36 ± 5% +0.4 0.77 ± 4% perf-profile.children.cycles-pp.mem_cgroup_commit_charge
98.33 +0.5 98.78 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
97.74 +0.6 98.34 perf-profile.children.cycles-pp.do_syscall_64
48.39 +0.8 49.15 perf-profile.children.cycles-pp.__x64_sys_fallocate
1.34 ± 5% +0.8 2.14 ± 4% perf-profile.children.cycles-pp.__mem_cgroup_charge
1.61 ± 4% +0.8 2.42 ± 2% perf-profile.children.cycles-pp.__mod_lruvec_page_state
48.17 +0.8 48.98 perf-profile.children.cycles-pp.vfs_fallocate
47.94 +0.9 48.82 perf-profile.children.cycles-pp.shmem_fallocate
47.10 +0.9 48.04 perf-profile.children.cycles-pp.shmem_get_folio_gfp
84.34 +0.9 85.28 perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
84.31 +0.9 85.26 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
84.24 +1.0 85.21 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
46.65 +1.1 47.70 perf-profile.children.cycles-pp.shmem_alloc_and_add_folio
1.23 ± 4% +1.4 2.58 ± 2% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
0.98 -0.3 0.73 ± 2% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.88 -0.2 0.70 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.60 -0.2 0.45 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.41 ± 3% -0.1 0.27 ± 3% perf-profile.self.cycles-pp.release_pages
0.41 -0.1 0.30 ± 3% perf-profile.self.cycles-pp.xas_store
0.41 ± 3% -0.1 0.29 ± 2% perf-profile.self.cycles-pp.folio_batch_move_lru
0.30 ± 3% -0.1 0.18 ± 5% perf-profile.self.cycles-pp.shmem_add_to_page_cache
0.38 ± 2% -0.1 0.27 ± 2% perf-profile.self.cycles-pp.__entry_text_start
0.30 ± 3% -0.1 0.20 ± 6% perf-profile.self.cycles-pp.lru_add_fn
0.28 ± 2% -0.1 0.20 ± 5% perf-profile.self.cycles-pp.shmem_fallocate
0.26 ± 2% -0.1 0.18 ± 5% perf-profile.self.cycles-pp.__mod_node_page_state
0.27 ± 3% -0.1 0.20 ± 2% perf-profile.self.cycles-pp._raw_spin_lock
0.21 ± 2% -0.1 0.15 ± 4% perf-profile.self.cycles-pp.__alloc_pages
0.20 ± 2% -0.1 0.14 ± 3% perf-profile.self.cycles-pp.xas_descend
0.26 ± 3% -0.1 0.20 ± 4% perf-profile.self.cycles-pp.find_lock_entries
0.18 ± 4% -0.0 0.13 ± 5% perf-profile.self.cycles-pp.xas_clear_mark
0.15 ± 7% -0.0 0.10 ± 11% perf-profile.self.cycles-pp.shmem_inode_acct_blocks
0.16 ± 4% -0.0 0.12 ± 4% perf-profile.self.cycles-pp.__dquot_alloc_space
0.13 ± 4% -0.0 0.09 ± 5% perf-profile.self.cycles-pp.free_unref_page_commit
0.13 -0.0 0.09 ± 5% perf-profile.self.cycles-pp._raw_spin_lock_irq
0.16 ± 4% -0.0 0.12 ± 4% perf-profile.self.cycles-pp.shmem_alloc_and_add_folio
0.13 ± 5% -0.0 0.09 ± 7% perf-profile.self.cycles-pp.__filemap_remove_folio
0.13 ± 2% -0.0 0.09 ± 5% perf-profile.self.cycles-pp.get_page_from_freelist
0.12 ± 4% -0.0 0.09 ± 5% perf-profile.self.cycles-pp.vfs_fallocate
0.06 ± 7% -0.0 0.02 ± 99% perf-profile.self.cycles-pp.apparmor_file_permission
0.13 ± 3% -0.0 0.10 ± 5% perf-profile.self.cycles-pp.fallocate64
0.11 ± 4% -0.0 0.07 perf-profile.self.cycles-pp.xas_start
0.07 ± 5% -0.0 0.03 ± 70% perf-profile.self.cycles-pp.shmem_alloc_folio
0.14 ± 4% -0.0 0.10 ± 7% perf-profile.self.cycles-pp.__fget_light
0.10 ± 4% -0.0 0.06 ± 7% perf-profile.self.cycles-pp.rmqueue
0.12 ± 3% -0.0 0.09 ± 4% perf-profile.self.cycles-pp.xas_load
0.11 ± 4% -0.0 0.08 ± 7% perf-profile.self.cycles-pp.folio_unlock
0.10 ± 4% -0.0 0.07 ± 8% perf-profile.self.cycles-pp.alloc_pages_mpol
0.15 ± 2% -0.0 0.12 ± 5% perf-profile.self.cycles-pp.shmem_get_folio_gfp
0.10 -0.0 0.07 perf-profile.self.cycles-pp.cap_vm_enough_memory
0.16 ± 2% -0.0 0.13 ± 6% perf-profile.self.cycles-pp.page_counter_uncharge
0.12 ± 5% -0.0 0.09 ± 4% perf-profile.self.cycles-pp.__cond_resched
0.06 ± 6% -0.0 0.03 ± 70% perf-profile.self.cycles-pp.filemap_free_folio
0.12 ± 3% -0.0 0.10 ± 5% perf-profile.self.cycles-pp.free_unref_page_list
0.12 -0.0 0.09 ± 4% perf-profile.self.cycles-pp.noop_dirty_folio
0.10 ± 3% -0.0 0.07 ± 5% perf-profile.self.cycles-pp.filemap_remove_folio
0.10 ± 5% -0.0 0.07 ± 5% perf-profile.self.cycles-pp.try_charge_memcg
0.12 ± 3% -0.0 0.10 ± 8% perf-profile.self.cycles-pp.cgroup_rstat_updated
0.09 ± 4% -0.0 0.07 ± 7% perf-profile.self.cycles-pp.__folio_cancel_dirty
0.08 ± 4% -0.0 0.06 ± 8% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.08 ± 5% -0.0 0.06 perf-profile.self.cycles-pp._raw_spin_trylock
0.08 -0.0 0.06 ± 6% perf-profile.self.cycles-pp.folio_add_lru
0.08 ± 8% -0.0 0.06 ± 6% perf-profile.self.cycles-pp.__mod_lruvec_state
0.07 ± 5% -0.0 0.05 perf-profile.self.cycles-pp.xas_find_conflict
0.08 ± 10% -0.0 0.06 ± 9% perf-profile.self.cycles-pp.truncate_cleanup_folio
0.07 ± 10% -0.0 0.05 perf-profile.self.cycles-pp.xas_init_marks
0.08 ± 4% -0.0 0.06 ± 7% perf-profile.self.cycles-pp.__percpu_counter_limited_add
0.07 ± 7% -0.0 0.05 perf-profile.self.cycles-pp.get_pfnblock_flags_mask
0.07 ± 5% -0.0 0.06 ± 8% perf-profile.self.cycles-pp.__list_add_valid_or_report
0.02 ±141% +0.0 0.06 ± 8% perf-profile.self.cycles-pp.uncharge_batch
0.21 ± 9% +0.1 0.31 ± 7% perf-profile.self.cycles-pp.mem_cgroup_commit_charge
0.69 ± 5% +0.1 0.83 ± 4% perf-profile.self.cycles-pp.get_mem_cgroup_from_mm
0.06 ± 6% +0.2 0.22 ± 2% perf-profile.self.cycles-pp.inode_needs_update_time
0.14 ± 8% +0.3 0.42 ± 7% perf-profile.self.cycles-pp.__mem_cgroup_charge
0.13 ± 7% +0.4 0.49 ± 3% perf-profile.self.cycles-pp.__count_memcg_events
84.24 +1.0 85.21 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
1.12 ± 5% +1.4 2.50 ± 2% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state


***************************************************************************************************
lkp-skl-fpga01: 104 threads 2 sockets (Skylake) with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-8.3/thread/50%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/fallocate1/will-it-scale

commit:
130617edc1 ("mm: memcg: move vmstats structs definition above flushing code")
51d74c18a9 ("mm: memcg: make stats flushing threshold per-memcg")

130617edc1cd1ba1 51d74c18a9c61e7ee33bc90b522
---------------- ---------------------------
%stddev %change %stddev
\ | \
1.87 -0.4 1.43 ± 3% mpstat.cpu.all.usr%
3171 -5.3% 3003 ± 2% vmstat.system.cs
84.83 ± 9% +55.8% 132.17 ± 16% perf-c2c.DRAM.local
484.17 ± 3% +37.1% 663.67 ± 10% perf-c2c.DRAM.remote
72763 ± 5% +14.4% 83212 ± 12% turbostat.C1
0.08 -25.0% 0.06 turbostat.IPC
27.90 +4.6% 29.18 turbostat.RAMWatt
3982212 -30.0% 2785941 will-it-scale.52.threads
76580 -30.0% 53575 will-it-scale.per_thread_ops
3982212 -30.0% 2785941 will-it-scale.workload
1.175e+09 ± 2% -28.6% 8.392e+08 ± 2% numa-numastat.node0.local_node
1.175e+09 ± 2% -28.6% 8.394e+08 ± 2% numa-numastat.node0.numa_hit
1.231e+09 ± 2% -31.3% 8.463e+08 ± 3% numa-numastat.node1.local_node
1.232e+09 ± 2% -31.3% 8.466e+08 ± 3% numa-numastat.node1.numa_hit
1.175e+09 ± 2% -28.6% 8.394e+08 ± 2% numa-vmstat.node0.numa_hit
1.175e+09 ± 2% -28.6% 8.392e+08 ± 2% numa-vmstat.node0.numa_local
1.232e+09 ± 2% -31.3% 8.466e+08 ± 3% numa-vmstat.node1.numa_hit
1.231e+09 ± 2% -31.3% 8.463e+08 ± 3% numa-vmstat.node1.numa_local
2.408e+09 -30.0% 1.686e+09 proc-vmstat.numa_hit
2.406e+09 -30.0% 1.685e+09 proc-vmstat.numa_local
2.404e+09 -29.9% 1.684e+09 proc-vmstat.pgalloc_normal
2.404e+09 -29.9% 1.684e+09 proc-vmstat.pgfree
0.04 ± 9% -19.3% 0.03 ± 6% perf-sched.wait_and_delay.avg.ms.__cond_resched.shmem_fallocate.vfs_fallocate.__x64_sys_fallocate.do_syscall_64
0.04 ± 8% -22.3% 0.03 ± 5% perf-sched.wait_and_delay.avg.ms.__cond_resched.shmem_undo_range.shmem_setattr.notify_change.do_truncate
0.91 ± 2% +11.3% 1.01 ± 5% perf-sched.wait_and_delay.avg.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
0.04 ± 13% -90.3% 0.00 ±223% perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
1.14 +15.1% 1.31 perf-sched.wait_and_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
189.94 ± 3% +18.3% 224.73 ± 4% perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
1652 ± 4% -13.4% 1431 ± 4% perf-sched.wait_and_delay.count.__cond_resched.shmem_fallocate.vfs_fallocate.__x64_sys_fallocate.do_syscall_64
83.67 ± 7% -87.6% 10.33 ±223% perf-sched.wait_and_delay.count.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
3827 ± 4% -13.0% 3328 ± 3% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
1.71 ±165% -83.4% 0.28 ± 21% perf-sched.wait_and_delay.max.ms.__cond_resched.shmem_fallocate.vfs_fallocate.__x64_sys_fallocate.do_syscall_64
0.43 ± 17% -43.8% 0.24 ± 26% perf-sched.wait_and_delay.max.ms.__cond_resched.shmem_inode_acct_blocks.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate
0.46 ± 17% -36.7% 0.29 ± 12% perf-sched.wait_and_delay.max.ms.__cond_resched.shmem_undo_range.shmem_setattr.notify_change.do_truncate
0.30 ± 34% -90.7% 0.03 ±223% perf-sched.wait_and_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
0.04 ± 9% -19.3% 0.03 ± 6% perf-sched.wait_time.avg.ms.__cond_resched.shmem_fallocate.vfs_fallocate.__x64_sys_fallocate.do_syscall_64
0.04 ± 8% -22.3% 0.03 ± 5% perf-sched.wait_time.avg.ms.__cond_resched.shmem_undo_range.shmem_setattr.notify_change.do_truncate
0.04 ± 11% -33.1% 0.03 ± 17% perf-sched.wait_time.avg.ms.__cond_resched.vfs_fallocate.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.90 ± 2% +11.5% 1.00 ± 5% perf-sched.wait_time.avg.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
0.04 ± 13% -26.6% 0.03 ± 12% perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
1.13 +15.2% 1.30 perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
189.93 ± 3% +18.3% 224.72 ± 4% perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
1.71 ±165% -83.4% 0.28 ± 21% perf-sched.wait_time.max.ms.__cond_resched.shmem_fallocate.vfs_fallocate.__x64_sys_fallocate.do_syscall_64
0.43 ± 17% -43.8% 0.24 ± 26% perf-sched.wait_time.max.ms.__cond_resched.shmem_inode_acct_blocks.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate
0.46 ± 17% -36.7% 0.29 ± 12% perf-sched.wait_time.max.ms.__cond_resched.shmem_undo_range.shmem_setattr.notify_change.do_truncate
0.75 +142.0% 1.83 ± 2% perf-stat.i.MPKI
8.47e+09 -24.4% 6.407e+09 perf-stat.i.branch-instructions
0.66 -0.0 0.63 perf-stat.i.branch-miss-rate%
56364992 -28.3% 40421603 ± 3% perf-stat.i.branch-misses
14.64 +6.7 21.30 perf-stat.i.cache-miss-rate%
30868184 +81.3% 55977240 ± 3% perf-stat.i.cache-misses
2.107e+08 +24.7% 2.627e+08 ± 2% perf-stat.i.cache-references
3106 -5.5% 2934 ± 2% perf-stat.i.context-switches
3.55 +33.4% 4.74 perf-stat.i.cpi
4722 -44.8% 2605 ± 3% perf-stat.i.cycles-between-cache-misses
0.04 -0.0 0.04 perf-stat.i.dTLB-load-miss-rate%
4117232 -29.1% 2917107 perf-stat.i.dTLB-load-misses
1.051e+10 -24.1% 7.979e+09 perf-stat.i.dTLB-loads
0.00 ± 3% +0.0 0.00 ± 6% perf-stat.i.dTLB-store-miss-rate%
5.886e+09 -27.5% 4.269e+09 perf-stat.i.dTLB-stores
78.16 -6.6 71.51 perf-stat.i.iTLB-load-miss-rate%
4131074 ± 3% -30.0% 2891515 perf-stat.i.iTLB-load-misses
4.098e+10 -25.0% 3.072e+10 perf-stat.i.instructions
9929 ± 2% +7.0% 10627 perf-stat.i.instructions-per-iTLB-miss
0.28 -25.0% 0.21 perf-stat.i.ipc
63.49 +43.8% 91.27 ± 3% perf-stat.i.metric.K/sec
241.12 -24.6% 181.87 perf-stat.i.metric.M/sec
3735316 +78.6% 6669641 ± 3% perf-stat.i.node-load-misses
377465 ± 4% +86.1% 702512 ± 11% perf-stat.i.node-loads
1322217 -27.6% 957081 ± 5% perf-stat.i.node-store-misses
37459 ± 3% -23.0% 28826 ± 5% perf-stat.i.node-stores
0.75 +141.8% 1.82 ± 2% perf-stat.overall.MPKI
0.67 -0.0 0.63 perf-stat.overall.branch-miss-rate%
14.65 +6.7 21.30 perf-stat.overall.cache-miss-rate%
3.55 +33.4% 4.73 perf-stat.overall.cpi
4713 -44.8% 2601 ± 3% perf-stat.overall.cycles-between-cache-misses
0.04 -0.0 0.04 perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 3% +0.0 0.00 ± 5% perf-stat.overall.dTLB-store-miss-rate%
78.14 -6.7 71.47 perf-stat.overall.iTLB-load-miss-rate%
9927 ± 2% +7.0% 10624 perf-stat.overall.instructions-per-iTLB-miss
0.28 -25.0% 0.21 perf-stat.overall.ipc
3098901 +7.1% 3318983 perf-stat.overall.path-length
8.441e+09 -24.4% 6.385e+09 perf-stat.ps.branch-instructions
56179581 -28.3% 40286337 ± 3% perf-stat.ps.branch-misses
30759982 +81.3% 55777812 ± 3% perf-stat.ps.cache-misses
2.1e+08 +24.6% 2.618e+08 ± 2% perf-stat.ps.cache-references
3095 -5.5% 2923 ± 2% perf-stat.ps.context-switches
4103292 -29.1% 2907270 perf-stat.ps.dTLB-load-misses
1.048e+10 -24.1% 7.952e+09 perf-stat.ps.dTLB-loads
5.866e+09 -27.5% 4.255e+09 perf-stat.ps.dTLB-stores
4117020 ± 3% -30.0% 2881750 perf-stat.ps.iTLB-load-misses
4.084e+10 -25.0% 3.062e+10 perf-stat.ps.instructions
3722149 +78.5% 6645867 ± 3% perf-stat.ps.node-load-misses
376240 ± 4% +86.1% 700053 ± 11% perf-stat.ps.node-loads
1317772 -27.6% 953773 ± 5% perf-stat.ps.node-store-misses
37408 ± 3% -23.2% 28748 ± 5% perf-stat.ps.node-stores
1.234e+13 -25.1% 9.246e+12 perf-stat.total.instructions
1.28 -0.4 0.90 ± 2% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.fallocate64
1.26 ± 2% -0.4 0.90 ± 3% perf-profile.calltrace.cycles-pp.shmem_alloc_folio.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate.vfs_fallocate
1.08 ± 2% -0.3 0.77 ± 3% perf-profile.calltrace.cycles-pp.alloc_pages_mpol.shmem_alloc_folio.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate
0.92 ± 2% -0.3 0.62 ± 3% perf-profile.calltrace.cycles-pp.shmem_inode_acct_blocks.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate.vfs_fallocate
0.84 ± 3% -0.2 0.61 ± 3% perf-profile.calltrace.cycles-pp.__alloc_pages.alloc_pages_mpol.shmem_alloc_folio.shmem_alloc_and_add_folio.shmem_get_folio_gfp
1.26 -0.2 1.08 perf-profile.calltrace.cycles-pp.folio_batch_move_lru.lru_add_drain_cpu.__folio_batch_release.shmem_undo_range.shmem_setattr
1.26 -0.2 1.08 perf-profile.calltrace.cycles-pp.lru_add_drain_cpu.__folio_batch_release.shmem_undo_range.shmem_setattr.notify_change
1.24 -0.2 1.06 perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.lru_add_drain_cpu.__folio_batch_release.shmem_undo_range
1.24 -0.2 1.06 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.lru_add_drain_cpu.__folio_batch_release
1.23 -0.2 1.06 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.lru_add_drain_cpu
1.20 -0.2 1.04 ± 2% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
0.68 ± 3% +0.0 0.72 ± 4% perf-profile.calltrace.cycles-pp.__mem_cgroup_uncharge_list.release_pages.__folio_batch_release.shmem_undo_range.shmem_setattr
1.08 +0.1 1.20 perf-profile.calltrace.cycles-pp.lru_add_fn.folio_batch_move_lru.folio_add_lru.shmem_alloc_and_add_folio.shmem_get_folio_gfp
2.91 +0.3 3.18 ± 2% perf-profile.calltrace.cycles-pp.truncate_inode_folio.shmem_undo_range.shmem_setattr.notify_change.do_truncate
2.56 +0.4 2.92 ± 2% perf-profile.calltrace.cycles-pp.filemap_remove_folio.truncate_inode_folio.shmem_undo_range.shmem_setattr.notify_change
1.36 ± 3% +0.4 1.76 ± 9% perf-profile.calltrace.cycles-pp.get_mem_cgroup_from_mm.__mem_cgroup_charge.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate
2.22 +0.5 2.68 ± 2% perf-profile.calltrace.cycles-pp.__filemap_remove_folio.filemap_remove_folio.truncate_inode_folio.shmem_undo_range.shmem_setattr
0.00 +0.6 0.60 ± 2% perf-profile.calltrace.cycles-pp.__mod_memcg_lruvec_state.release_pages.__folio_batch_release.shmem_undo_range.shmem_setattr
2.33 +0.6 2.94 perf-profile.calltrace.cycles-pp.shmem_add_to_page_cache.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate.vfs_fallocate
0.00 +0.7 0.72 ± 2% perf-profile.calltrace.cycles-pp.__mod_memcg_lruvec_state.lru_add_fn.folio_batch_move_lru.folio_add_lru.shmem_alloc_and_add_folio
0.69 ± 4% +0.8 1.47 ± 3% perf-profile.calltrace.cycles-pp.__mod_memcg_lruvec_state.__mod_lruvec_page_state.filemap_unaccount_folio.__filemap_remove_folio.filemap_remove_folio
1.24 ± 2% +0.8 2.04 ± 2% perf-profile.calltrace.cycles-pp.filemap_unaccount_folio.__filemap_remove_folio.filemap_remove_folio.truncate_inode_folio.shmem_undo_range
0.00 +0.8 0.82 ± 4% perf-profile.calltrace.cycles-pp.__count_memcg_events.mem_cgroup_commit_charge.__mem_cgroup_charge.shmem_alloc_and_add_folio.shmem_get_folio_gfp
1.17 ± 2% +0.8 2.00 ± 2% perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.filemap_unaccount_folio.__filemap_remove_folio.filemap_remove_folio.truncate_inode_folio
0.59 ± 4% +0.9 1.53 perf-profile.calltrace.cycles-pp.__mod_memcg_lruvec_state.__mod_lruvec_page_state.shmem_add_to_page_cache.shmem_alloc_and_add_folio.shmem_get_folio_gfp
1.38 +1.0 2.33 ± 2% perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.shmem_add_to_page_cache.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate
0.62 ± 3% +1.0 1.66 ± 5% perf-profile.calltrace.cycles-pp.mem_cgroup_commit_charge.__mem_cgroup_charge.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate
38.70 +1.2 39.90 perf-profile.calltrace.cycles-pp.vfs_fallocate.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
38.34 +1.3 39.65 perf-profile.calltrace.cycles-pp.shmem_fallocate.vfs_fallocate.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe
37.24 +1.6 38.86 perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_fallocate.vfs_fallocate.__x64_sys_fallocate.do_syscall_64
36.64 +1.8 38.40 perf-profile.calltrace.cycles-pp.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate.vfs_fallocate.__x64_sys_fallocate
2.47 ± 2% +2.1 4.59 ± 8% perf-profile.calltrace.cycles-pp.__mem_cgroup_charge.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_fallocate.vfs_fallocate
1.30 -0.4 0.92 ± 2% perf-profile.children.cycles-pp.syscall_return_via_sysret
1.28 ± 2% -0.4 0.90 ± 3% perf-profile.children.cycles-pp.shmem_alloc_folio
1.10 ± 2% -0.3 0.78 ± 3% perf-profile.children.cycles-pp.alloc_pages_mpol
0.96 ± 2% -0.3 0.64 ± 3% perf-profile.children.cycles-pp.shmem_inode_acct_blocks
0.88 -0.3 0.58 ± 2% perf-profile.children.cycles-pp.xas_store
0.88 ± 3% -0.2 0.64 ± 3% perf-profile.children.cycles-pp.__alloc_pages
0.61 ± 2% -0.2 0.43 ± 3% perf-profile.children.cycles-pp.__entry_text_start
1.26 -0.2 1.09 perf-profile.children.cycles-pp.lru_add_drain_cpu
0.56 -0.2 0.39 ± 4% perf-profile.children.cycles-pp.free_unref_page_list
1.22 -0.2 1.06 ± 2% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.46 -0.1 0.32 ± 3% perf-profile.children.cycles-pp.__mod_lruvec_state
0.41 ± 3% -0.1 0.28 ± 4% perf-profile.children.cycles-pp.xas_load
0.44 ± 4% -0.1 0.31 ± 4% perf-profile.children.cycles-pp.find_lock_entries
0.50 ± 3% -0.1 0.37 ± 2% perf-profile.children.cycles-pp.get_page_from_freelist
0.24 ± 7% -0.1 0.12 ± 5% perf-profile.children.cycles-pp.__list_add_valid_or_report
0.34 ± 2% -0.1 0.24 ± 4% perf-profile.children.cycles-pp.__mod_node_page_state
0.38 ± 3% -0.1 0.28 ± 4% perf-profile.children.cycles-pp._raw_spin_lock
0.32 ± 2% -0.1 0.22 ± 5% perf-profile.children.cycles-pp.__dquot_alloc_space
0.26 ± 2% -0.1 0.17 ± 2% perf-profile.children.cycles-pp.xas_descend
0.22 ± 3% -0.1 0.14 ± 4% perf-profile.children.cycles-pp.free_unref_page_commit
0.25 -0.1 0.17 ± 3% perf-profile.children.cycles-pp.xas_clear_mark
0.32 ± 4% -0.1 0.25 ± 3% perf-profile.children.cycles-pp.rmqueue
0.23 ± 2% -0.1 0.16 ± 2% perf-profile.children.cycles-pp.xas_init_marks
0.24 ± 2% -0.1 0.17 ± 5% perf-profile.children.cycles-pp.__cond_resched
0.25 ± 4% -0.1 0.18 ± 2% perf-profile.children.cycles-pp.truncate_cleanup_folio
0.30 ± 3% -0.1 0.23 ± 4% perf-profile.children.cycles-pp.filemap_get_entry
0.20 ± 2% -0.1 0.13 ± 5% perf-profile.children.cycles-pp.folio_unlock
0.16 ± 4% -0.1 0.10 ± 5% perf-profile.children.cycles-pp.xas_find_conflict
0.19 ± 3% -0.1 0.13 ± 5% perf-profile.children.cycles-pp._raw_spin_lock_irq
0.17 ± 5% -0.1 0.12 ± 3% perf-profile.children.cycles-pp.noop_dirty_folio
0.13 ± 4% -0.1 0.08 ± 9% perf-profile.children.cycles-pp.security_vm_enough_memory_mm
0.18 ± 8% -0.1 0.13 ± 4% perf-profile.children.cycles-pp.shmem_recalc_inode
0.16 ± 2% -0.1 0.11 ± 3% perf-profile.children.cycles-pp.free_unref_page_prepare
0.09 ± 5% -0.1 0.04 ± 45% perf-profile.children.cycles-pp.mem_cgroup_update_lru_size
0.10 ± 7% -0.0 0.05 ± 45% perf-profile.children.cycles-pp.cap_vm_enough_memory
0.14 ± 5% -0.0 0.10 perf-profile.children.cycles-pp.__folio_cancel_dirty
0.14 ± 5% -0.0 0.10 ± 4% perf-profile.children.cycles-pp.security_file_permission
0.10 ± 5% -0.0 0.06 ± 6% perf-profile.children.cycles-pp.xas_find
0.15 ± 4% -0.0 0.11 ± 3% perf-profile.children.cycles-pp.__fget_light
0.14 ± 5% -0.0 0.11 ± 3% perf-profile.children.cycles-pp.file_modified
0.12 ± 3% -0.0 0.09 ± 7% perf-profile.children.cycles-pp.__vm_enough_memory
0.12 ± 3% -0.0 0.09 ± 4% perf-profile.children.cycles-pp.apparmor_file_permission
0.12 ± 3% -0.0 0.08 ± 5% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.12 ± 4% -0.0 0.08 ± 4% perf-profile.children.cycles-pp.xas_start
0.09 -0.0 0.06 ± 8% perf-profile.children.cycles-pp.__folio_throttle_swaprate
0.12 ± 6% -0.0 0.08 ± 8% perf-profile.children.cycles-pp._raw_spin_trylock
0.12 ± 4% -0.0 0.08 ± 4% perf-profile.children.cycles-pp.__percpu_counter_limited_add
0.12 ± 4% -0.0 0.09 ± 4% perf-profile.children.cycles-pp.inode_add_bytes
0.20 ± 2% -0.0 0.17 ± 7% perf-profile.children.cycles-pp.try_charge_memcg
0.10 ± 5% -0.0 0.07 ± 7% perf-profile.children.cycles-pp.policy_nodemask
0.09 ± 6% -0.0 0.06 ± 6% perf-profile.children.cycles-pp.get_pfnblock_flags_mask
0.09 ± 6% -0.0 0.06 ± 7% perf-profile.children.cycles-pp.filemap_free_folio
0.07 ± 6% -0.0 0.05 ± 7% perf-profile.children.cycles-pp.down_write
0.08 ± 4% -0.0 0.06 ± 8% perf-profile.children.cycles-pp.get_task_policy
0.09 ± 5% -0.0 0.07 ± 5% perf-profile.children.cycles-pp.xas_create
0.09 ± 7% -0.0 0.07 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
0.09 ± 7% -0.0 0.07 perf-profile.children.cycles-pp.inode_needs_update_time
0.16 ± 2% -0.0 0.14 ± 5% perf-profile.children.cycles-pp.cgroup_rstat_updated
0.08 ± 7% -0.0 0.06 ± 9% perf-profile.children.cycles-pp.percpu_counter_add_batch
0.07 ± 5% -0.0 0.05 ± 7% perf-profile.children.cycles-pp.folio_mark_dirty
0.08 ± 10% -0.0 0.06 ± 6% perf-profile.children.cycles-pp.shmem_is_huge
0.07 ± 6% +0.0 0.09 ± 10% perf-profile.children.cycles-pp.propagate_protected_usage
0.43 ± 3% +0.0 0.46 ± 5% perf-profile.children.cycles-pp.uncharge_batch
0.68 ± 3% +0.0 0.73 ± 4% perf-profile.children.cycles-pp.__mem_cgroup_uncharge_list
1.11 +0.1 1.22 perf-profile.children.cycles-pp.lru_add_fn
2.91 +0.3 3.18 ± 2% perf-profile.children.cycles-pp.truncate_inode_folio
2.56 +0.4 2.92 ± 2% perf-profile.children.cycles-pp.filemap_remove_folio
1.37 ± 3% +0.4 1.76 ± 9% perf-profile.children.cycles-pp.get_mem_cgroup_from_mm
2.24 +0.5 2.70 ± 2% perf-profile.children.cycles-pp.__filemap_remove_folio
2.38 +0.6 2.97 perf-profile.children.cycles-pp.shmem_add_to_page_cache
0.18 ± 4% +0.7 0.91 ± 4% perf-profile.children.cycles-pp.__count_memcg_events
1.26 +0.8 2.04 ± 2% perf-profile.children.cycles-pp.filemap_unaccount_folio
0.63 ± 2% +1.0 1.67 ± 5% perf-profile.children.cycles-pp.mem_cgroup_commit_charge
38.71 +1.2 39.91 perf-profile.children.cycles-pp.vfs_fallocate
38.37 +1.3 39.66 perf-profile.children.cycles-pp.shmem_fallocate
37.28 +1.6 38.89 perf-profile.children.cycles-pp.shmem_get_folio_gfp
36.71 +1.7 38.45 perf-profile.children.cycles-pp.shmem_alloc_and_add_folio
2.58 +1.8 4.36 ± 2% perf-profile.children.cycles-pp.__mod_lruvec_page_state
2.48 ± 2% +2.1 4.60 ± 8% perf-profile.children.cycles-pp.__mem_cgroup_charge
1.93 ± 3% +2.4 4.36 ± 2% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
1.30 -0.4 0.92 ± 2% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.73 -0.2 0.52 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.54 ± 2% -0.2 0.36 ± 3% perf-profile.self.cycles-pp.release_pages
0.48 -0.2 0.30 ± 3% perf-profile.self.cycles-pp.xas_store
0.54 ± 2% -0.2 0.38 ± 3% perf-profile.self.cycles-pp.__entry_text_start
1.17 -0.1 1.03 ± 2% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.36 ± 2% -0.1 0.22 ± 3% perf-profile.self.cycles-pp.shmem_add_to_page_cache
0.43 ± 5% -0.1 0.30 ± 7% perf-profile.self.cycles-pp.lru_add_fn
0.24 ± 7% -0.1 0.12 ± 6% perf-profile.self.cycles-pp.__list_add_valid_or_report
0.38 ± 4% -0.1 0.27 ± 4% perf-profile.self.cycles-pp._raw_spin_lock
0.52 ± 3% -0.1 0.41 perf-profile.self.cycles-pp.folio_batch_move_lru
0.32 ± 2% -0.1 0.22 ± 4% perf-profile.self.cycles-pp.__mod_node_page_state
0.36 ± 4% -0.1 0.26 ± 4% perf-profile.self.cycles-pp.find_lock_entries
0.36 ± 2% -0.1 0.26 ± 2% perf-profile.self.cycles-pp.shmem_fallocate
0.28 ± 3% -0.1 0.20 ± 5% perf-profile.self.cycles-pp.__alloc_pages
0.24 ± 2% -0.1 0.16 ± 4% perf-profile.self.cycles-pp.xas_descend
0.23 ± 2% -0.1 0.16 ± 3% perf-profile.self.cycles-pp.xas_clear_mark
0.18 ± 3% -0.1 0.11 ± 6% perf-profile.self.cycles-pp.free_unref_page_commit
0.18 ± 3% -0.1 0.12 ± 4% perf-profile.self.cycles-pp.shmem_inode_acct_blocks
0.21 ± 3% -0.1 0.15 ± 2% perf-profile.self.cycles-pp.shmem_alloc_and_add_folio
0.18 ± 2% -0.1 0.12 ± 3% perf-profile.self.cycles-pp.__filemap_remove_folio
0.18 ± 7% -0.1 0.12 ± 7% perf-profile.self.cycles-pp.vfs_fallocate
0.20 ± 2% -0.1 0.14 ± 6% perf-profile.self.cycles-pp.__dquot_alloc_space
0.18 ± 2% -0.1 0.13 ± 3% perf-profile.self.cycles-pp.folio_unlock
0.18 ± 2% -0.1 0.12 ± 3% perf-profile.self.cycles-pp.get_page_from_freelist
0.15 ± 3% -0.1 0.10 ± 7% perf-profile.self.cycles-pp.xas_load
0.17 ± 3% -0.1 0.12 ± 8% perf-profile.self.cycles-pp.__cond_resched
0.17 ± 2% -0.1 0.12 ± 3% perf-profile.self.cycles-pp._raw_spin_lock_irq
0.17 ± 5% -0.1 0.12 ± 3% perf-profile.self.cycles-pp.noop_dirty_folio
0.10 ± 7% -0.0 0.05 ± 45% perf-profile.self.cycles-pp.cap_vm_enough_memory
0.12 ± 3% -0.0 0.08 ± 4% perf-profile.self.cycles-pp.rmqueue
0.07 ± 5% -0.0 0.02 ± 99% perf-profile.self.cycles-pp.xas_find
0.13 ± 3% -0.0 0.09 ± 6% perf-profile.self.cycles-pp.alloc_pages_mpol
0.07 ± 6% -0.0 0.03 ± 70% perf-profile.self.cycles-pp.xas_find_conflict
0.16 ± 2% -0.0 0.12 ± 6% perf-profile.self.cycles-pp.free_unref_page_list
0.12 ± 5% -0.0 0.08 ± 4% perf-profile.self.cycles-pp.fallocate64
0.20 ± 4% -0.0 0.16 ± 3% perf-profile.self.cycles-pp.shmem_get_folio_gfp
0.06 ± 7% -0.0 0.02 ± 99% perf-profile.self.cycles-pp.shmem_recalc_inode
0.13 ± 3% -0.0 0.09 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.22 ± 3% -0.0 0.19 ± 6% perf-profile.self.cycles-pp.page_counter_uncharge
0.14 ± 3% -0.0 0.10 ± 6% perf-profile.self.cycles-pp.filemap_remove_folio
0.15 ± 5% -0.0 0.11 ± 3% perf-profile.self.cycles-pp.__fget_light
0.12 ± 4% -0.0 0.08 perf-profile.self.cycles-pp.__folio_cancel_dirty
0.11 ± 4% -0.0 0.08 ± 7% perf-profile.self.cycles-pp._raw_spin_trylock
0.12 ± 3% -0.0 0.09 ± 5% perf-profile.self.cycles-pp.__mod_lruvec_state
0.11 ± 5% -0.0 0.08 ± 4% perf-profile.self.cycles-pp.truncate_cleanup_folio
0.11 ± 3% -0.0 0.08 ± 6% perf-profile.self.cycles-pp.__percpu_counter_limited_add
0.11 ± 3% -0.0 0.08 ± 6% perf-profile.self.cycles-pp.xas_start
0.10 ± 6% -0.0 0.07 ± 5% perf-profile.self.cycles-pp.xas_init_marks
0.09 ± 6% -0.0 0.06 ± 6% perf-profile.self.cycles-pp.get_pfnblock_flags_mask
0.11 -0.0 0.08 ± 5% perf-profile.self.cycles-pp.folio_add_lru
0.09 ± 6% -0.0 0.06 ± 7% perf-profile.self.cycles-pp.filemap_free_folio
0.09 ± 4% -0.0 0.06 ± 6% perf-profile.self.cycles-pp.shmem_alloc_folio
0.14 ± 5% -0.0 0.12 ± 5% perf-profile.self.cycles-pp.cgroup_rstat_updated
0.10 ± 4% -0.0 0.08 ± 4% perf-profile.self.cycles-pp.apparmor_file_permission
0.07 ± 7% -0.0 0.04 ± 44% perf-profile.self.cycles-pp.policy_nodemask
0.07 ± 11% -0.0 0.04 ± 45% perf-profile.self.cycles-pp.shmem_is_huge
0.08 ± 4% -0.0 0.06 ± 8% perf-profile.self.cycles-pp.get_task_policy
0.08 ± 6% -0.0 0.05 ± 8% perf-profile.self.cycles-pp.__x64_sys_fallocate
0.12 ± 3% -0.0 0.10 ± 6% perf-profile.self.cycles-pp.try_charge_memcg
0.07 -0.0 0.05 perf-profile.self.cycles-pp.free_unref_page_prepare
0.07 ± 6% -0.0 0.06 ± 9% perf-profile.self.cycles-pp.percpu_counter_add_batch
0.08 ± 4% -0.0 0.06 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
0.09 ± 7% -0.0 0.07 ± 5% perf-profile.self.cycles-pp.filemap_get_entry
0.07 ± 9% +0.0 0.09 ± 10% perf-profile.self.cycles-pp.propagate_protected_usage
0.96 ± 2% +0.2 1.12 ± 7% perf-profile.self.cycles-pp.__mod_lruvec_page_state
0.45 ± 4% +0.4 0.82 ± 8% perf-profile.self.cycles-pp.mem_cgroup_commit_charge
1.36 ± 3% +0.4 1.75 ± 9% perf-profile.self.cycles-pp.get_mem_cgroup_from_mm
0.29 +0.7 1.00 ± 10% perf-profile.self.cycles-pp.__mem_cgroup_charge
0.16 ± 4% +0.7 0.90 ± 4% perf-profile.self.cycles-pp.__count_memcg_events
1.80 ± 2% +2.5 4.26 ± 2% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki