Re: [lkp] [huge tmpfs] d7c7d56ca6: vm-scalability.throughput -5.5% regression

From: Hugh Dickins
Date: Wed Apr 13 2016 - 01:30:51 EST


On Wed, 13 Apr 2016, kernel test robot wrote:

> FYI, we noticed a -5.5% regression of vm-scalability.throughput on
>
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> commit d7c7d56ca61aec18e5e0cb3a64e50073c42195f7 ("huge tmpfs: avoid premature exposure of new pagetable")

Very useful info, thank you. I presume it confirms exactly what Kirill
warned me of: that doing the map_pages after the fault, instead of before
it, comes with a performance disadvantage. I shall look into it, but not
immediately (and we already know of other reasons why that patch has to
be revisited).
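
For anyone reading along who has not looked at that code: below is a rough
sketch, not the actual patch, of the two orderings in question. The function
names (sketch_map_then_fault / sketch_fault_then_map) are made up for
illustration, and the bodies are heavily simplified from the real
do_fault()/do_read_fault() path; only the ->map_pages()/->fault() callbacks
and their ~v4.5-era signatures are real.

#include <linux/mm.h>

/*
 * Rough sketch only -- not the patch, and much simplified from the real
 * do_fault()/do_read_fault() code; it just contrasts the two call orders.
 */

/* Ordering 1: map_pages before the fault (the old shape). */
static int sketch_map_then_fault(struct vm_area_struct *vma,
				 struct vm_fault *vmf)
{
	/*
	 * Cheaply map pages already uptodate in the page cache around the
	 * faulting address: no I/O, no blocking.  (The real code also
	 * skips ->fault() altogether when this populates the faulting
	 * pte itself.)
	 */
	if (vma->vm_ops->map_pages)
		vma->vm_ops->map_pages(vma, vmf);

	/* Fall back to the (possibly blocking) fault for the target page. */
	return vma->vm_ops->fault(vma, vmf);
}

/* Ordering 2: fault first, map_pages afterwards (the shape under test). */
static int sketch_fault_then_map(struct vm_area_struct *vma,
				 struct vm_fault *vmf)
{
	int ret;

	/* Every first touch pays for the full fault path up front... */
	ret = vma->vm_ops->fault(vma, vmf);

	/* ...and only then are the surrounding ptes populated. */
	if (vma->vm_ops->map_pages)
		vma->vm_ops->map_pages(vma, vmf);

	return ret;
}

The point is just that the second ordering gives up the cheap path where
pages already in the page cache get mapped without taking the full fault,
which is where the throughput cost would come from.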

Hugh

>
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/test/testcase:
> gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/300s/lkp-hsw01/lru-file-mmap-read-rand/vm-scalability
>
> commit:
> 517348161d2725b8b596feb10c813bf596dc6a47
> d7c7d56ca61aec18e5e0cb3a64e50073c42195f7
>
> 517348161d2725b8 d7c7d56ca61aec18e5e0cb3a64
> ---------------- --------------------------
> fail:runs %reproduction fail:runs
> | | |
> 1801726 ± 0% -5.5% 1702808 ± 0% vm-scalability.throughput
> 317.89 ± 0% +2.9% 327.15 ± 0% vm-scalability.time.elapsed_time
> 317.89 ± 0% +2.9% 327.15 ± 0% vm-scalability.time.elapsed_time.max
> 872240 ± 4% +8.5% 946467 ± 1% vm-scalability.time.involuntary_context_switches
> 6.73e+08 ± 0% -92.5% 50568722 ± 0% vm-scalability.time.major_page_faults
> 2109093 ± 9% -25.8% 1564815 ± 7% vm-scalability.time.maximum_resident_set_size
> 37881 ± 0% +586.9% 260194 ± 0% vm-scalability.time.minor_page_faults
> 5087 ± 0% +3.7% 5277 ± 0% vm-scalability.time.percent_of_cpu_this_job_got
> 16047 ± 0% +7.5% 17252 ± 0% vm-scalability.time.system_time
> 127.19 ± 0% -88.3% 14.93 ± 1% vm-scalability.time.user_time
> 72572 ± 7% +56.0% 113203 ± 3% cpuidle.C1-HSW.usage
> 9.879e+08 ± 4% -32.5% 6.67e+08 ± 8% cpuidle.C6-HSW.time
> 605545 ± 3% -12.9% 527295 ± 1% softirqs.RCU
> 164170 ± 7% +20.5% 197881 ± 6% softirqs.SCHED
> 2584429 ± 3% -25.5% 1925241 ± 2% vmstat.memory.free
> 252507 ± 0% +36.2% 343994 ± 0% vmstat.system.in
> 2.852e+08 ± 5% +163.9% 7.527e+08 ± 1% numa-numastat.node0.local_node
> 2.852e+08 ± 5% +163.9% 7.527e+08 ± 1% numa-numastat.node0.numa_hit
> 2.876e+08 ± 6% +162.8% 7.559e+08 ± 0% numa-numastat.node1.local_node
> 2.876e+08 ± 6% +162.8% 7.559e+08 ± 0% numa-numastat.node1.numa_hit
> 6.73e+08 ± 0% -92.5% 50568722 ± 0% time.major_page_faults
> 2109093 ± 9% -25.8% 1564815 ± 7% time.maximum_resident_set_size
> 37881 ± 0% +586.9% 260194 ± 0% time.minor_page_faults
> 127.19 ± 0% -88.3% 14.93 ± 1% time.user_time
> 94.37 ± 0% +2.0% 96.27 ± 0% turbostat.%Busy
> 2919 ± 0% +2.0% 2977 ± 0% turbostat.Avg_MHz
> 5.12 ± 4% -38.7% 3.14 ± 5% turbostat.CPU%c6
> 2.00 ± 13% -44.8% 1.10 ± 22% turbostat.Pkg%pc2
> 240.00 ± 0% +4.2% 250.14 ± 0% turbostat.PkgWatt
> 55.36 ± 3% +16.3% 64.40 ± 2% turbostat.RAMWatt
> 17609 ±103% -59.4% 7148 ± 72% latency_stats.avg.pipe_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
> 63966 ±152% -68.4% 20204 ± 64% latency_stats.max.pipe_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
> 299681 ±123% -89.7% 30889 ± 13% latency_stats.max.pipe_wait.pipe_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
> 0.00 ± -1% +Inf% 35893 ± 10% latency_stats.max.wait_on_page_bit_killable.__lock_page_or_retry.filemap_fault.__do_fault.do_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
> 90871 ±125% -56.2% 39835 ± 74% latency_stats.sum.pipe_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
> 61821 ± 22% -86.6% 8254 ± 62% latency_stats.sum.sigsuspend.SyS_rt_sigsuspend.entry_SYSCALL_64_fastpath
> 0.00 ± -1% +Inf% 59392 ±118% latency_stats.sum.throttle_direct_reclaim.try_to_free_pages.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.__do_page_cache_readahead.filemap_fault.xfs_filemap_fault.__do_fault.do_fault.handle_mm_fault.__do_page_fault
> 0.00 ± -1% +Inf% 1549096 ± 24% latency_stats.sum.wait_on_page_bit_killable.__lock_page_or_retry.filemap_fault.__do_fault.do_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
> 639.30 ± 8% -38.8% 391.40 ± 6% slabinfo.RAW.active_objs
> 639.30 ± 8% -38.8% 391.40 ± 6% slabinfo.RAW.num_objs
> 555.90 ± 14% -50.7% 274.10 ± 36% slabinfo.nfs_commit_data.active_objs
> 555.90 ± 14% -50.7% 274.10 ± 36% slabinfo.nfs_commit_data.num_objs
> 10651978 ± 0% -80.0% 2126718 ± 0% slabinfo.radix_tree_node.active_objs
> 218915 ± 0% -81.9% 39535 ± 0% slabinfo.radix_tree_node.active_slabs
> 12259274 ± 0% -81.9% 2213762 ± 0% slabinfo.radix_tree_node.num_objs
> 218915 ± 0% -81.9% 39535 ± 0% slabinfo.radix_tree_node.num_slabs
> 8503640 ± 1% -87.8% 1038681 ± 0% meminfo.Active
> 8155208 ± 1% -91.5% 692744 ± 0% meminfo.Active(file)
> 47732497 ± 0% +13.9% 54365008 ± 0% meminfo.Cached
> 38794624 ± 0% +36.4% 52899738 ± 0% meminfo.Inactive
> 38748440 ± 0% +36.4% 52853183 ± 0% meminfo.Inactive(file)
> 45315491 ± 0% -24.0% 34459599 ± 0% meminfo.Mapped
> 2693407 ± 5% -30.7% 1867438 ± 3% meminfo.MemFree
> 7048370 ± 0% -81.5% 1303216 ± 0% meminfo.SReclaimable
> 7145508 ± 0% -80.4% 1400313 ± 0% meminfo.Slab
> 4168849 ± 2% -88.1% 496040 ± 27% numa-meminfo.node0.Active
> 3987391 ± 1% -91.3% 346768 ± 0% numa-meminfo.node0.Active(file)
> 23809283 ± 0% +13.8% 27087077 ± 0% numa-meminfo.node0.FilePages
> 19423374 ± 0% +35.8% 26379857 ± 0% numa-meminfo.node0.Inactive
> 19402281 ± 0% +35.8% 26356354 ± 0% numa-meminfo.node0.Inactive(file)
> 22594121 ± 0% -24.1% 17153129 ± 0% numa-meminfo.node0.Mapped
> 1430871 ± 5% -31.2% 984861 ± 2% numa-meminfo.node0.MemFree
> 3457483 ± 1% -81.4% 642147 ± 0% numa-meminfo.node0.SReclaimable
> 3507005 ± 1% -80.3% 692577 ± 0% numa-meminfo.node0.Slab
> 4349443 ± 3% -87.5% 543711 ± 24% numa-meminfo.node1.Active
> 4181422 ± 3% -91.7% 346861 ± 1% numa-meminfo.node1.Active(file)
> 23896184 ± 0% +14.2% 27287954 ± 0% numa-meminfo.node1.FilePages
> 19329324 ± 0% +37.2% 26528591 ± 0% numa-meminfo.node1.Inactive
> 19304364 ± 0% +37.3% 26505692 ± 0% numa-meminfo.node1.Inactive(file)
> 22671758 ± 0% -23.7% 17303673 ± 0% numa-meminfo.node1.Mapped
> 1299430 ± 7% -32.8% 873435 ± 6% numa-meminfo.node1.MemFree
> 3589265 ± 1% -81.6% 661650 ± 0% numa-meminfo.node1.SReclaimable
> 3636880 ± 1% -80.5% 708315 ± 0% numa-meminfo.node1.Slab
> 994864 ± 1% -91.3% 86711 ± 0% numa-vmstat.node0.nr_active_file
> 5952715 ± 0% +13.8% 6773427 ± 0% numa-vmstat.node0.nr_file_pages
> 356982 ± 5% -31.5% 244513 ± 3% numa-vmstat.node0.nr_free_pages
> 4853127 ± 0% +35.8% 6590709 ± 0% numa-vmstat.node0.nr_inactive_file
> 394.70 ± 15% -62.9% 146.60 ± 32% numa-vmstat.node0.nr_isolated_file
> 5649360 ± 0% -24.1% 4288873 ± 0% numa-vmstat.node0.nr_mapped
> 28030 ± 53% -97.7% 648.30 ± 10% numa-vmstat.node0.nr_pages_scanned
> 864516 ± 1% -81.4% 160512 ± 0% numa-vmstat.node0.nr_slab_reclaimable
> 1.522e+08 ± 4% +155.9% 3.893e+08 ± 1% numa-vmstat.node0.numa_hit
> 1.521e+08 ± 4% +155.9% 3.893e+08 ± 1% numa-vmstat.node0.numa_local
> 217926 ± 3% -84.4% 33949 ± 2% numa-vmstat.node0.workingset_activate
> 60138428 ± 2% -72.5% 16533446 ± 0% numa-vmstat.node0.workingset_nodereclaim
> 4367580 ± 3% +158.4% 11285489 ± 1% numa-vmstat.node0.workingset_refault
> 1043245 ± 3% -91.7% 86749 ± 1% numa-vmstat.node1.nr_active_file
> 5974941 ± 0% +14.2% 6823255 ± 0% numa-vmstat.node1.nr_file_pages
> 323798 ± 7% -33.0% 216945 ± 5% numa-vmstat.node1.nr_free_pages
> 4829122 ± 1% +37.2% 6627644 ± 0% numa-vmstat.node1.nr_inactive_file
> 395.80 ± 8% -68.5% 124.80 ± 46% numa-vmstat.node1.nr_isolated_file
> 5669082 ± 0% -23.7% 4326551 ± 0% numa-vmstat.node1.nr_mapped
> 32004 ± 60% -99.9% 47.00 ± 9% numa-vmstat.node1.nr_pages_scanned
> 897351 ± 1% -81.6% 165406 ± 0% numa-vmstat.node1.nr_slab_reclaimable
> 1.535e+08 ± 4% +154.6% 3.909e+08 ± 0% numa-vmstat.node1.numa_hit
> 1.535e+08 ± 4% +154.7% 3.909e+08 ± 0% numa-vmstat.node1.numa_local
> 235134 ± 5% -85.7% 33507 ± 2% numa-vmstat.node1.workingset_activate
> 59647268 ± 1% -72.1% 16626347 ± 0% numa-vmstat.node1.workingset_nodereclaim
> 4535102 ± 4% +151.1% 11389137 ± 0% numa-vmstat.node1.workingset_refault
> 347641 ± 13% +97.0% 684832 ± 0% proc-vmstat.allocstall
> 7738 ± 9% +236.5% 26042 ± 0% proc-vmstat.kswapd_low_wmark_hit_quickly
> 2041367 ± 1% -91.5% 173206 ± 0% proc-vmstat.nr_active_file
> 1233230 ± 0% +11.7% 1378011 ± 0% proc-vmstat.nr_dirty_background_threshold
> 2466460 ± 0% +11.7% 2756024 ± 0% proc-vmstat.nr_dirty_threshold
> 11933740 ± 0% +13.9% 13594909 ± 0% proc-vmstat.nr_file_pages
> 671934 ± 5% -31.1% 463093 ± 3% proc-vmstat.nr_free_pages
> 9685062 ± 0% +36.5% 13216819 ± 0% proc-vmstat.nr_inactive_file
> 792.80 ± 10% -67.9% 254.20 ± 34% proc-vmstat.nr_isolated_file
> 11327952 ± 0% -23.9% 8616859 ± 0% proc-vmstat.nr_mapped
> 73994 ± 51% -99.1% 657.00 ± 7% proc-vmstat.nr_pages_scanned
> 1762423 ± 0% -81.5% 325807 ± 0% proc-vmstat.nr_slab_reclaimable
> 72.30 ± 23% +852.4% 688.60 ± 58% proc-vmstat.nr_vmscan_immediate_reclaim
> 5392 ± 2% -11.9% 4750 ± 2% proc-vmstat.numa_hint_faults
> 5.728e+08 ± 5% +163.4% 1.509e+09 ± 0% proc-vmstat.numa_hit
> 5.728e+08 ± 5% +163.4% 1.509e+09 ± 0% proc-vmstat.numa_local
> 5638 ± 4% -12.5% 4935 ± 3% proc-vmstat.numa_pte_updates
> 8684 ± 8% +215.8% 27427 ± 0% proc-vmstat.pageoutrun
> 3220941 ± 0% -90.2% 315751 ± 0% proc-vmstat.pgactivate
> 17739240 ± 1% +143.6% 43217427 ± 0% proc-vmstat.pgalloc_dma32
> 6.6e+08 ± 0% +138.1% 1.572e+09 ± 0% proc-vmstat.pgalloc_normal
> 6.737e+08 ± 0% -92.4% 51517407 ± 0% proc-vmstat.pgfault
> 6.767e+08 ± 0% +138.5% 1.614e+09 ± 0% proc-vmstat.pgfree
> 6.73e+08 ± 0% -92.5% 50568722 ± 0% proc-vmstat.pgmajfault
> 31567471 ± 1% +91.6% 60472288 ± 0% proc-vmstat.pgscan_direct_dma32
> 1.192e+09 ± 2% +84.5% 2.199e+09 ± 0% proc-vmstat.pgscan_direct_normal
> 16309661 ± 0% +150.4% 40841573 ± 0% proc-vmstat.pgsteal_direct_dma32
> 6.151e+08 ± 0% +140.8% 1.481e+09 ± 0% proc-vmstat.pgsteal_direct_normal
> 939746 ± 18% +101.3% 1891322 ± 6% proc-vmstat.pgsteal_kswapd_dma32
> 27432476 ± 4% +162.4% 71970660 ± 2% proc-vmstat.pgsteal_kswapd_normal
> 4.802e+08 ± 5% -81.5% 88655347 ± 0% proc-vmstat.slabs_scanned
> 452671 ± 2% -85.1% 67360 ± 1% proc-vmstat.workingset_activate
> 1.198e+08 ± 1% -72.4% 33135682 ± 0% proc-vmstat.workingset_nodereclaim
> 8898128 ± 1% +154.6% 22657102 ± 0% proc-vmstat.workingset_refault
> 613962 ± 12% -18.6% 499880 ± 9% sched_debug.cfs_rq:/.min_vruntime.stddev
> 31.47 ± 38% +203.5% 95.52 ± 29% sched_debug.cfs_rq:/.nr_spread_over.max
> 6.19 ± 32% +150.9% 15.53 ± 24% sched_debug.cfs_rq:/.nr_spread_over.stddev
> 41.71 ± 51% -42.3% 24.07 ± 12% sched_debug.cfs_rq:/.runnable_load_avg.avg
> 1094 ±106% -60.9% 427.95 ± 25% sched_debug.cfs_rq:/.runnable_load_avg.max
> 163.22 ± 92% -63.2% 60.09 ± 28% sched_debug.cfs_rq:/.runnable_load_avg.stddev
> 613932 ± 12% -18.6% 499833 ± 9% sched_debug.cfs_rq:/.spread0.stddev
> 35.20 ± 8% -29.1% 24.97 ± 11% sched_debug.cpu.cpu_load[0].avg
> 731.80 ± 11% -36.1% 467.45 ± 21% sched_debug.cpu.cpu_load[0].max
> 116.23 ± 10% -43.5% 65.72 ± 23% sched_debug.cpu.cpu_load[0].stddev
> 35.25 ± 8% -25.6% 26.23 ± 10% sched_debug.cpu.cpu_load[1].avg
> 722.47 ± 10% -30.2% 504.05 ± 18% sched_debug.cpu.cpu_load[1].max
> 115.25 ± 10% -38.5% 70.82 ± 19% sched_debug.cpu.cpu_load[1].stddev
> 35.37 ± 8% -22.4% 27.45 ± 8% sched_debug.cpu.cpu_load[2].avg
> 721.90 ± 9% -27.7% 521.60 ± 16% sched_debug.cpu.cpu_load[2].max
> 10.85 ± 14% +16.9% 12.68 ± 6% sched_debug.cpu.cpu_load[2].min
> 114.93 ± 9% -35.1% 74.62 ± 16% sched_debug.cpu.cpu_load[2].stddev
> 35.20 ± 8% -21.3% 27.70 ± 5% sched_debug.cpu.cpu_load[3].avg
> 705.73 ± 9% -29.6% 496.57 ± 13% sched_debug.cpu.cpu_load[3].max
> 10.95 ± 13% +18.7% 13.00 ± 4% sched_debug.cpu.cpu_load[3].min
> 112.58 ± 9% -34.8% 73.35 ± 12% sched_debug.cpu.cpu_load[3].stddev
> 34.96 ± 8% -21.7% 27.39 ± 5% sched_debug.cpu.cpu_load[4].avg
> 684.63 ± 10% -32.0% 465.83 ± 11% sched_debug.cpu.cpu_load[4].max
> 11.10 ± 12% +17.7% 13.07 ± 3% sched_debug.cpu.cpu_load[4].min
> 110.03 ± 9% -36.1% 70.28 ± 10% sched_debug.cpu.cpu_load[4].stddev
> 293.58 ± 28% +110.8% 618.85 ± 32% sched_debug.cpu.curr->pid.min
> 18739 ± 3% +10.5% 20713 ± 1% sched_debug.cpu.nr_switches.avg
> 33332 ± 10% +21.0% 40337 ± 6% sched_debug.cpu.nr_switches.max
> 4343 ± 10% +34.8% 5852 ± 8% sched_debug.cpu.nr_switches.stddev
> 19363 ± 3% +9.2% 21136 ± 1% sched_debug.cpu.sched_count.avg
> 20.35 ± 17% -31.5% 13.93 ± 22% sched_debug.cpu.sched_goidle.min
> 9245 ± 3% +12.5% 10398 ± 0% sched_debug.cpu.ttwu_count.avg
> 16837 ± 10% +27.0% 21390 ± 8% sched_debug.cpu.ttwu_count.max
> 2254 ± 8% +39.5% 3143 ± 8% sched_debug.cpu.ttwu_count.stddev
> 8052 ± 4% +16.2% 9353 ± 0% sched_debug.cpu.ttwu_local.avg
> 5846 ± 4% +11.0% 6491 ± 2% sched_debug.cpu.ttwu_local.min
> 1847 ± 11% +39.8% 2582 ± 8% sched_debug.cpu.ttwu_local.stddev
> 3.66 ± 4% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.filemap_fault.xfs_filemap_fault.__do_fault
> 0.00 ± -1% +Inf% 1.12 ± 0% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.mpage_readpages.xfs_vm_readpages.__do_page_cache_readahead
> 0.00 ± -1% +Inf% 77.72 ± 0% perf-profile.cycles-pp.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.__do_page_cache_readahead.filemap_fault
> 79.28 ± 0% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.filemap_fault.xfs_filemap_fault
> 11.43 ± 5% -89.4% 1.21 ± 4% perf-profile.cycles-pp.__delete_from_page_cache.__remove_mapping.shrink_page_list.shrink_inactive_list.shrink_zone_memcg
> 0.00 ± -1% +Inf% 96.93 ± 0% perf-profile.cycles-pp.__do_fault.do_fault.handle_mm_fault.__do_page_fault.do_page_fault
> 91.04 ± 0% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__do_fault.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault
> 0.00 ± -1% +Inf% 96.66 ± 0% perf-profile.cycles-pp.__do_page_cache_readahead.filemap_fault.xfs_filemap_fault.__do_fault.do_fault
> 29.86 ± 3% -96.9% 0.92 ± 19% perf-profile.cycles-pp.__list_lru_walk_one.isra.3.list_lru_walk_one.scan_shadow_nodes.shrink_slab.shrink_zone
> 1.59 ± 14% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__lru_cache_add.lru_cache_add.add_to_page_cache_lru.filemap_fault.xfs_filemap_fault
> 0.00 ± -1% +Inf% 5.67 ± 5% perf-profile.cycles-pp.__lru_cache_add.lru_cache_add.add_to_page_cache_lru.mpage_readpages.xfs_vm_readpages
> 0.00 ± -1% +Inf% 78.11 ± 0% perf-profile.cycles-pp.__page_cache_alloc.__do_page_cache_readahead.filemap_fault.xfs_filemap_fault.__do_fault
> 79.40 ± 0% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__page_cache_alloc.filemap_fault.xfs_filemap_fault.__do_fault.handle_pte_fault
> 1.28 ± 4% -38.7% 0.78 ± 1% perf-profile.cycles-pp.__radix_tree_lookup.__delete_from_page_cache.__remove_mapping.shrink_page_list.shrink_inactive_list
> 25.30 ± 6% -84.2% 3.99 ± 5% perf-profile.cycles-pp.__remove_mapping.shrink_page_list.shrink_inactive_list.shrink_zone_memcg.shrink_zone
> 0.56 ± 0% +98.2% 1.11 ± 0% perf-profile.cycles-pp.__rmqueue.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc
> 0.00 ± -1% +Inf% 1.11 ± 0% perf-profile.cycles-pp.__xfs_get_blocks.xfs_get_blocks.do_mpage_readpage.mpage_readpages.xfs_vm_readpages
> 0.01 ±133% +30254.3% 2.66 ± 8% perf-profile.cycles-pp._raw_spin_lock.free_pcppages_bulk.free_hot_cold_page.free_hot_cold_page_list.shrink_page_list
> 5.07 ± 25% +268.7% 18.71 ± 3% perf-profile.cycles-pp._raw_spin_lock.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc
> 9.16 ± 6% -100.0% 0.00 ± -1% perf-profile.cycles-pp._raw_spin_lock.list_lru_add.__delete_from_page_cache.__remove_mapping.shrink_page_list
> 0.69 ± 64% -100.0% 0.00 ± -1% perf-profile.cycles-pp._raw_spin_lock.list_lru_del.__add_to_page_cache_locked.add_to_page_cache_lru.filemap_fault
> 27.69 ± 3% -100.0% 0.00 ± -1% perf-profile.cycles-pp._raw_spin_lock.shadow_lru_isolate.__list_lru_walk_one.list_lru_walk_one.scan_shadow_nodes
> 10.77 ± 10% +238.5% 36.45 ± 1% perf-profile.cycles-pp._raw_spin_lock_irq.shrink_inactive_list.shrink_zone_memcg.shrink_zone.do_try_to_free_pages
> 0.35 ± 9% +193.4% 1.02 ± 13% perf-profile.cycles-pp._raw_spin_lock_irq.shrink_inactive_list.shrink_zone_memcg.shrink_zone.kswapd
> 12.86 ± 9% -89.4% 1.36 ± 17% perf-profile.cycles-pp._raw_spin_lock_irqsave.__remove_mapping.shrink_page_list.shrink_inactive_list.shrink_zone_memcg
> 1.11 ± 18% +333.5% 4.83 ± 6% perf-profile.cycles-pp._raw_spin_lock_irqsave.pagevec_lru_move_fn.__lru_cache_add.lru_cache_add.add_to_page_cache_lru
> 5.38 ± 5% -100.0% 0.00 ± -1% perf-profile.cycles-pp.add_to_page_cache_lru.filemap_fault.xfs_filemap_fault.__do_fault.handle_pte_fault
> 0.00 ± -1% +Inf% 7.15 ± 4% perf-profile.cycles-pp.add_to_page_cache_lru.mpage_readpages.xfs_vm_readpages.__do_page_cache_readahead.filemap_fault
> 0.00 ± -1% +Inf% 78.06 ± 0% perf-profile.cycles-pp.alloc_pages_current.__page_cache_alloc.__do_page_cache_readahead.filemap_fault.xfs_filemap_fault
> 79.38 ± 0% -100.0% 0.00 ± -1% perf-profile.cycles-pp.alloc_pages_current.__page_cache_alloc.filemap_fault.xfs_filemap_fault.__do_fault
> 0.00 ± -1% +Inf% 97.32 ± 0% perf-profile.cycles-pp.do_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
> 5.19 ± 2% -100.0% 0.00 ± -1% perf-profile.cycles-pp.do_mpage_readpage.mpage_readpage.xfs_vm_readpage.filemap_fault.xfs_filemap_fault
> 0.00 ± -1% +Inf% 10.68 ± 1% perf-profile.cycles-pp.do_mpage_readpage.mpage_readpages.xfs_vm_readpages.__do_page_cache_readahead.filemap_fault
> 0.72 ± 67% -98.3% 0.01 ± 87% perf-profile.cycles-pp.do_syscall_64.return_from_SYSCALL_64.__libc_fork
> 72.75 ± 1% -23.2% 55.88 ± 1% perf-profile.cycles-pp.do_try_to_free_pages.try_to_free_pages.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc
> 0.00 ± -1% +Inf% 96.86 ± 0% perf-profile.cycles-pp.filemap_fault.xfs_filemap_fault.__do_fault.do_fault.handle_mm_fault
> 90.80 ± 0% -100.0% 0.00 ± -1% perf-profile.cycles-pp.filemap_fault.xfs_filemap_fault.__do_fault.handle_pte_fault.handle_mm_fault
> 2.39 ± 3% -100.0% 0.00 ± -1% perf-profile.cycles-pp.filemap_map_pages.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault
> 0.97 ± 12% +321.3% 4.07 ± 6% perf-profile.cycles-pp.free_hot_cold_page.free_hot_cold_page_list.shrink_page_list.shrink_inactive_list.shrink_zone_memcg
> 1.03 ± 9% +303.4% 4.17 ± 5% perf-profile.cycles-pp.free_hot_cold_page_list.shrink_page_list.shrink_inactive_list.shrink_zone_memcg.shrink_zone
> 0.65 ± 23% +451.9% 3.58 ± 6% perf-profile.cycles-pp.free_pcppages_bulk.free_hot_cold_page.free_hot_cold_page_list.shrink_page_list.shrink_inactive_list
> 0.00 ± -1% +Inf% 21.18 ± 2% perf-profile.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.__do_page_cache_readahead
> 6.22 ± 21% -100.0% 0.00 ± -1% perf-profile.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.filemap_fault
> 94.07 ± 0% -100.0% 0.00 ± -1% perf-profile.cycles-pp.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
> 0.57 ± 1% +104.6% 1.16 ± 2% perf-profile.cycles-pp.isolate_lru_pages.isra.47.shrink_inactive_list.shrink_zone_memcg.shrink_zone.do_try_to_free_pages
> 2.96 ± 7% -30.5% 2.05 ± 9% perf-profile.cycles-pp.kthread.ret_from_fork
> 9.58 ± 6% -100.0% 0.00 ±229% perf-profile.cycles-pp.list_lru_add.__delete_from_page_cache.__remove_mapping.shrink_page_list.shrink_inactive_list
> 1.88 ± 6% -100.0% 0.00 ± -1% perf-profile.cycles-pp.list_lru_del.__add_to_page_cache_locked.add_to_page_cache_lru.filemap_fault.xfs_filemap_fault
> 29.08 ± 3% -97.0% 0.89 ± 19% perf-profile.cycles-pp.list_lru_walk_one.scan_shadow_nodes.shrink_slab.shrink_zone.do_try_to_free_pages
> 1.59 ± 14% -100.0% 0.00 ± -1% perf-profile.cycles-pp.lru_cache_add.add_to_page_cache_lru.filemap_fault.xfs_filemap_fault.__do_fault
> 0.00 ± -1% +Inf% 5.68 ± 5% perf-profile.cycles-pp.lru_cache_add.add_to_page_cache_lru.mpage_readpages.xfs_vm_readpages.__do_page_cache_readahead
> 5.24 ± 2% -100.0% 0.00 ± -1% perf-profile.cycles-pp.mpage_readpage.xfs_vm_readpage.filemap_fault.xfs_filemap_fault.__do_fault
> 0.00 ± -1% +Inf% 18.20 ± 1% perf-profile.cycles-pp.mpage_readpages.xfs_vm_readpages.__do_page_cache_readahead.filemap_fault.xfs_filemap_fault
> 2.37 ± 14% +79.9% 4.27 ± 13% perf-profile.cycles-pp.native_flush_tlb_others.try_to_unmap_flush.shrink_page_list.shrink_inactive_list.shrink_zone_memcg
> 0.01 ±133% +30322.9% 2.66 ± 8% perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.free_pcppages_bulk.free_hot_cold_page.free_hot_cold_page_list
> 5.07 ± 25% +268.8% 18.71 ± 3% perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_current
> 9.16 ± 6% -100.0% 0.00 ± -1% perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.list_lru_add.__delete_from_page_cache.__remove_mapping
> 0.75 ± 57% -100.0% 0.00 ± -1% perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.list_lru_del.__add_to_page_cache_locked.add_to_page_cache_lru
> 27.68 ± 3% -100.0% 0.00 ± -1% perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.shadow_lru_isolate.__list_lru_walk_one.list_lru_walk_one
> 11.09 ± 10% +237.5% 37.44 ± 0% perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.shrink_inactive_list.shrink_zone_memcg.shrink_zone
> 12.76 ± 9% -90.9% 1.17 ± 22% perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.__remove_mapping.shrink_page_list.shrink_inactive_list
> 1.08 ± 19% +338.2% 4.75 ± 7% perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.pagevec_lru_move_fn.__lru_cache_add.lru_cache_add
> 1.81 ± 2% -73.7% 0.48 ± 1% perf-profile.cycles-pp.page_check_address_transhuge.page_referenced_one.rmap_walk_file.rmap_walk.page_referenced
> 3.24 ± 1% -42.5% 1.87 ± 2% perf-profile.cycles-pp.page_referenced.shrink_page_list.shrink_inactive_list.shrink_zone_memcg.shrink_zone
> 2.20 ± 2% -66.0% 0.75 ± 5% perf-profile.cycles-pp.page_referenced_one.rmap_walk_file.rmap_walk.page_referenced.shrink_page_list
> 1.54 ± 14% -100.0% 0.00 ± -1% perf-profile.cycles-pp.pagevec_lru_move_fn.__lru_cache_add.lru_cache_add.add_to_page_cache_lru.filemap_fault
> 0.00 ± -1% +Inf% 5.57 ± 5% perf-profile.cycles-pp.pagevec_lru_move_fn.__lru_cache_add.lru_cache_add.add_to_page_cache_lru.mpage_readpages
> 2.07 ± 4% -100.0% 0.00 ± -1% perf-profile.cycles-pp.radix_tree_next_chunk.filemap_map_pages.handle_pte_fault.handle_mm_fault.__do_page_fault
> 3.01 ± 6% -31.7% 2.05 ± 9% perf-profile.cycles-pp.ret_from_fork
> 0.72 ± 67% -98.5% 0.01 ± 94% perf-profile.cycles-pp.return_from_SYSCALL_64.__libc_fork
> 3.15 ± 1% -46.0% 1.70 ± 1% perf-profile.cycles-pp.rmap_walk.page_referenced.shrink_page_list.shrink_inactive_list.shrink_zone_memcg
> 3.02 ± 2% -48.4% 1.56 ± 1% perf-profile.cycles-pp.rmap_walk_file.rmap_walk.page_referenced.shrink_page_list.shrink_inactive_list
> 29.08 ± 3% -97.0% 0.89 ± 19% perf-profile.cycles-pp.scan_shadow_nodes.shrink_slab.shrink_zone.do_try_to_free_pages.try_to_free_pages
> 28.89 ± 3% -97.1% 0.84 ± 22% perf-profile.cycles-pp.shadow_lru_isolate.__list_lru_walk_one.list_lru_walk_one.scan_shadow_nodes.shrink_slab
> 44.93 ± 4% +21.9% 54.77 ± 1% perf-profile.cycles-pp.shrink_inactive_list.shrink_zone_memcg.shrink_zone.do_try_to_free_pages.try_to_free_pages
> 33.07 ± 4% -50.8% 16.28 ± 3% perf-profile.cycles-pp.shrink_page_list.shrink_inactive_list.shrink_zone_memcg.shrink_zone.do_try_to_free_pages
> 1.11 ± 16% -22.6% 0.86 ± 6% perf-profile.cycles-pp.shrink_page_list.shrink_inactive_list.shrink_zone_memcg.shrink_zone.kswapd
> 29.15 ± 3% -96.8% 0.94 ± 16% perf-profile.cycles-pp.shrink_slab.shrink_zone.do_try_to_free_pages.try_to_free_pages.__alloc_pages_nodemask
> 73.07 ± 1% -23.5% 55.91 ± 1% perf-profile.cycles-pp.shrink_zone.do_try_to_free_pages.try_to_free_pages.__alloc_pages_nodemask.alloc_pages_current
> 45.01 ± 4% +22.1% 54.95 ± 1% perf-profile.cycles-pp.shrink_zone_memcg.shrink_zone.do_try_to_free_pages.try_to_free_pages.__alloc_pages_nodemask
> 2.35 ± 14% +78.9% 4.21 ± 13% perf-profile.cycles-pp.smp_call_function_many.native_flush_tlb_others.try_to_unmap_flush.shrink_page_list.shrink_inactive_list
> 0.00 ± -1% +Inf% 55.91 ± 1% perf-profile.cycles-pp.try_to_free_pages.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.__do_page_cache_readahead
> 72.76 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.try_to_free_pages.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.filemap_fault
> 2.38 ± 14% +79.5% 4.28 ± 13% perf-profile.cycles-pp.try_to_unmap_flush.shrink_page_list.shrink_inactive_list.shrink_zone_memcg.shrink_zone
> 0.58 ± 1% +51.9% 0.88 ± 14% perf-profile.cycles-pp.workingset_eviction.__remove_mapping.shrink_page_list.shrink_inactive_list.shrink_zone_memcg
> 0.00 ± -1% +Inf% 96.89 ± 0% perf-profile.cycles-pp.xfs_filemap_fault.__do_fault.do_fault.handle_mm_fault.__do_page_fault
> 91.02 ± 0% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_filemap_fault.__do_fault.handle_pte_fault.handle_mm_fault.__do_page_fault
> 0.00 ± -1% +Inf% 1.11 ± 0% perf-profile.cycles-pp.xfs_get_blocks.do_mpage_readpage.mpage_readpages.xfs_vm_readpages.__do_page_cache_readahead
> 5.26 ± 2% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_vm_readpage.filemap_fault.xfs_filemap_fault.__do_fault.handle_pte_fault
> 0.00 ± -1% +Inf% 18.21 ± 1% perf-profile.cycles-pp.xfs_vm_readpages.__do_page_cache_readahead.filemap_fault.xfs_filemap_fault.__do_fault
>
>
>
> lkp-hsw01: Grantley Haswell-EP
> Memory: 64G
>
>
>
>
> vm-scalability.time.user_time
>
> [ ASCII trend plot (alignment lost in transit): bisect-good (*) samples
>   cluster around 120-135 seconds of user_time, bisect-bad (O) samples
>   around 15-20 seconds ]
>
>
> vm-scalability.time.major_page_faults
>
> [ ASCII trend plot (alignment lost in transit): bisect-good (*) samples
>   cluster around 6e+08-7e+08 major page faults, bisect-bad (O) samples
>   near 5e+07 ]
>
>
> [*] bisect-good sample
> [O] bisect-bad sample
>
> To reproduce:
>
> git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
> cd lkp-tests
> bin/lkp install job.yaml # job file is attached in this email
> bin/lkp run job.yaml
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
> Thanks,
> Xiaolong Ye