Re: [lkp] [sched/preempt] 103637a5b94: 12.4% fsmark.app_overhead, -1.4% fsmark.files_per_sec

From: Konstantin Khlebnikov
Date: Sat Jul 25 2015 - 12:32:25 EST


I guess that's because of the cond_resched_lock() usage in nfs_scan_commit_list().
My patch fixes cond_resched_lock(), which has not worked since v3.13.

On Sat, Jul 25, 2015 at 9:09 AM, kernel test robot <ying.huang@xxxxxxxxx> wrote:
> FYI, we noticed the below changes on
>
> git://internal_merge_and_test_tree revert-103637a5b947af7e6abb8d19b341acff6fbf1ec3-103637a5b947af7e6abb8d19b341acff6fbf1ec3
> commit 103637a5b947af7e6abb8d19b341acff6fbf1ec3 ("sched/preempt: Fix cond_resched_lock() and cond_resched_softirq()")
>
>
> =========================================================================================
> tbox_group/testcase/rootfs/kconfig/compiler/cpufreq_governor/iterations/nr_threads/disk/fs/fs2/filesize/test_size/sync_method/nr_directories/nr_files_per_directory:
> nhm4/fsmark/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/performance/1x/32t/1HDD/f2fs/nfsv4/9B/400M/fsyncBeforeClose/16d/256fpd
>
> commit:
> 834b9279b37ad019272ff140497b1e07ab52d124
> 103637a5b947af7e6abb8d19b341acff6fbf1ec3
>
> 834b9279b37ad019 103637a5b947af7e6abb8d19b3
> ---------------- --------------------------
> %stddev %change %stddev
> \ | \
> 10808007 ± 2% +12.4% 12147111 ± 3% fsmark.app_overhead
> 348.95 ± 0% -1.4% 343.95 ± 0% fsmark.files_per_sec
> 292.92 ± 0% +1.8% 298.06 ± 0% fsmark.time.elapsed_time
> 292.92 ± 0% +1.8% 298.06 ± 0% fsmark.time.elapsed_time.max
> 103928 ± 0% +13.9% 118415 ± 0% fsmark.time.involuntary_context_switches
> 468477 ± 0% -2.2% 458135 ± 0% fsmark.time.voluntary_context_switches
> 103928 ± 0% +13.9% 118415 ± 0% time.involuntary_context_switches
> 16609 ± 0% -2.1% 16267 ± 0% vmstat.system.in
> 4527868 ± 2% +25.9% 5698416 ± 2% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_execute.rpc_run_task.nfs4_call_sync_sequence.[nfsv4]._nfs4_proc_lookup.[nfsv4].nfs4_proc_lookup_common.[nfsv4].nfs4_proc_lookup.[nfsv4].nfs_lookup_revalidate.nfs4_lookup_revalidate.lookup_dcache.__lookup_hash
> 2.317e+09 ± 0% +5.5% 2.444e+09 ± 0% latency_stats.sum.rpc_wait_bit_killable.__rpc_wait_for_completion_task.nfs4_run_open_task.[nfsv4]._nfs4_open_and_get_state.[nfsv4].nfs4_do_open.[nfsv4].nfs4_atomic_open.[nfsv4].nfs_atomic_open.path_openat.do_filp_open.do_sys_open.SyS_open.entry_SYSCALL_64_fastpath
> 1.548e+09 ± 0% +2.9% 1.593e+09 ± 0% latency_stats.sum.wait_on_page_bit.filemap_fdatawait_range.filemap_write_and_wait_range.nfs4_file_fsync.[nfsv4].vfs_fsync_range.do_fsync.SyS_fsync.entry_SYSCALL_64_fastpath
> 3.399e+08 ± 1% +19.0% 4.043e+08 ± 2% cpuidle.C1-NHM.time
> 82737781 ± 0% +24.8% 1.032e+08 ± 0% cpuidle.C1E-NHM.time
> 141228 ± 0% +23.4% 174263 ± 0% cpuidle.C1E-NHM.usage
> 128964 ±154% +209.1% 398639 ± 26% cpuidle.POLL.time
> 3.47 ± 0% -2.5% 3.38 ± 0% turbostat.%Busy
> 110.25 ± 0% -2.0% 108.00 ± 0% turbostat.Avg_MHz
> 32.94 ± 0% +14.4% 37.69 ± 1% turbostat.CPU%c1
> 32.26 ± 1% -11.3% 28.62 ± 1% turbostat.CPU%c3
> -2795 ± -9% -19.2% -2258 ± -4% sched_debug.cfs_rq[1]:/.spread0
> 3273 ± 4% +29.9% 4250 ± 10% sched_debug.cfs_rq[5]:/.avg->runnable_avg_sum
> 70.25 ± 4% +29.5% 91.00 ± 10% sched_debug.cfs_rq[5]:/.tg_runnable_contrib
> -2687 ± 0% -14.1% -2308 ± -2% sched_debug.cpu#0.nr_uninterruptible
> 333314 ± 74% -46.7% 177695 ± 3% sched_debug.cpu#0.ttwu_count
> 318.25 ± 11% -41.3% 186.75 ± 6% sched_debug.cpu#1.nr_uninterruptible
> 331.50 ± 9% -16.0% 278.50 ± 11% sched_debug.cpu#2.nr_uninterruptible
> 172.25 ± 21% +63.6% 281.75 ± 14% sched_debug.cpu#3.nr_uninterruptible
> 753.75 ± 2% +38.7% 1045 ± 4% sched_debug.cpu#4.nr_uninterruptible
> 839497 ±169% -98.0% 16681 ± 1% sched_debug.cpu#4.ttwu_local
> 392.00 ± 3% -50.6% 193.50 ± 12% sched_debug.cpu#5.nr_uninterruptible
> 12.50 ± 35% +272.0% 46.50 ± 73% sched_debug.cpu#6.cpu_load[0]
> 385.00 ± 7% -53.0% 181.00 ± 8% sched_debug.cpu#6.nr_uninterruptible
> 362.00 ± 2% -53.8% 167.25 ± 15% sched_debug.cpu#7.nr_uninterruptible
>
>
> nhm4: Nehalem
> Memory: 4G
>
>
> fsmark.files_per_sec
>
> 352 *+-*-------------------------*-----*--*-------------------------------+
> 351 ++ *.. .. : : : |
> | . . : : : |
> 350 ++ *.. .* : : : |
> 349 ++ *..*..*..*. * *...*..*..* |
> | |
> 348 ++ |
> 347 ++ |
> 346 ++ |
> O O O O O O O O |
> 345 ++ O O O
> 344 ++ O O O |
> | |
> 343 ++ O O O O O O O O O |
> 342 ++--------------------------------------------------------------------+
>
>
> fsmark.time.elapsed_time
>
> 299 ++--------------------------------------------------------------------+
> | O O O O O O O O O |
> 298 ++ O O O O O O O O O
> 297 O+ O O O |
> | O |
> 296 ++ |
> 295 ++ |
> | |
> 294 ++ *..*.. |
> 293 ++ .. *.. *...*..*.. |
> *.. *...* *..*... .*.. + * |
> 292 ++ .. *. *.. + |
> 291 ++ * * |
> | |
> 290 ++--------------------------------------------------------------------+
>
>
> fsmark.time.elapsed_time.max
>
> 299 ++--------------------------------------------------------------------+
> | O O O O O O O O O |
> 298 ++ O O O O O O O O O
> 297 O+ O O O |
> | O |
> 296 ++ |
> 295 ++ |
> | |
> 294 ++ *..*.. |
> 293 ++ .. *.. *...*..*.. |
> *.. *...* *..*... .*.. + * |
> 292 ++ .. *. *.. + |
> 291 ++ * * |
> | |
> 290 ++--------------------------------------------------------------------+
>
>
> fsmark.time.voluntary_context_switches
>
> 470000 ++--------------------------------*--------------------------------+
> | *..*. *.. .*..*..* |
> 468000 *+.*..*.. .*..*..*.. .. *. |
> 466000 ++ *. .* |
> | *. |
> 464000 ++ |
> | |
> 462000 ++ |
> | |
> 460000 ++ |
> 458000 ++ O O O O O O O O
> | O O O O O O O O |
> 456000 O+ O O O O O O |
> | |
> 454000 ++-----------------------------------------------------------------+
>
>
> fsmark.time.involuntary_context_switches
>
> 120000 ++-----------------------------------------------------------------+
> | O O O O O O
> 118000 O+ O O O O O O O |
> 116000 ++ O O O O O O O O O |
> | |
> 114000 ++ |
> 112000 ++ |
> | |
> 110000 ++ |
> 108000 ++ |
> | |
> 106000 ++ |
> 104000 *+.*..*..*..*.. .*..*...*..*.. .*..* |
> | *..*..*..*. *..*. |
> 102000 ++-----------------------------------------------------------------+
>
>
> time.involuntary_context_switches
>
> 120000 ++-----------------------------------------------------------------+
> | O O O O O O
> 118000 O+ O O O O O O O |
> 116000 ++ O O O O O O O O O |
> | |
> 114000 ++ |
> 112000 ++ |
> | |
> 110000 ++ |
> 108000 ++ |
> | |
> 106000 ++ |
> 104000 *+.*..*..*..*.. .*..*...*..*.. .*..* |
> | *..*..*..*. *..*. |
> 102000 ++-----------------------------------------------------------------+
>
>
> [*] bisect-good sample
> [O] bisect-bad sample
>
> To reproduce:
>
> git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
> cd lkp-tests
> bin/lkp install job.yaml # job file is attached in this email
> bin/lkp run job.yaml
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
> Thanks,
> Ying Huang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/