Re: PROBLEM: Scheduling latency not respected in CFS with HRTICK enabled

From: Wanpeng Li
Date: Sat May 07 2016 - 20:19:42 EST


2016-05-06 23:43 GMT+08:00 Tomasz Grabiec <tgrabiec@xxxxxxxxxxxx>:
> Hi,
>
> System details:
>
> Kernel: 4.2.0-27-generic #32~14.04.1-Ubuntu SMP Fri Jan 22 15:32:26
> UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> CPU: 2 sockets, 8 cores per socket, 2 threads per core
> CPU Model: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
>
> I was trying to reduce scheduling delays below 1/HZ, which is 4ms
> for our kernel. I enabled HRTICK scheduler feature and set
> scheduler tunables to:
>
> sched_latency_ns: 1000000
> sched_min_granularity_ns: 400000
> sched_wakeup_granularity_ns: 400000
> sched_autogroup_enabled: 0
> sched_tunable_scaling: 1
> sched_cfs_bandwidth_slice_us: 5000
> sched_child_runs_first: 0
> sched_migration_cost_ns: 500000
> sched_nr_migrate: 32
> sched_rr_timeslice_ms: 25
> sched_rt_period_us: 1000000
> sched_rt_runtime_us: 950000
> sched_shares_window_ns: 1000000
> sched_time_avg_ms: 1000
>
> Note that scheduling latency is set to 1ms and min_granularity to 0.4ms.
>
> The test case I used was to run two processes competing for CPU
> pinned to a single core, like this:
>
> python -c "import math; math.factorial(100000000)" &
> pid1=$!
> python -c "import math; math.factorial(100000000)" &
> pid2=$!
> sleep 1
> sudo taskset -cap 2 $pid1
> sudo taskset -cap 2 $pid2
>
> Using perf I've found that scheduling delays do not stay around
> 0.5ms as could be expected from scheduling latency, but switch
> between modes of 0.5ms and 4ms back and forth.
>
> For example (perf script):
>
> python 147208 [002] 3272129.329011: sched:sched_stat_wait:
> comm=python pid=147207 delay=504875 [ns]
> python 147207 [002] 3272129.329515: sched:sched_stat_wait:
> comm=python pid=147208 delay=504096 [ns]
> python 147208 [002] 3272129.330019: sched:sched_stat_wait:
> comm=python pid=147207 delay=503936 [ns]
> python 147207 [002] 3272129.330523: sched:sched_stat_wait:
> comm=python pid=147208 delay=503888 [ns]
> python 147208 [002] 3272129.331067: sched:sched_stat_wait:
> comm=python pid=147207 delay=543902 [ns]
> python 147207 [002] 3272129.334047: sched:sched_stat_wait:
> comm=python pid=147208 delay=2980114 [ns]
> python 147208 [002] 3272129.338048: sched:sched_stat_wait:
> comm=python pid=147207 delay=4000695 [ns]
> python 147207 [002] 3272129.342051: sched:sched_stat_wait:
> comm=python pid=147208 delay=4002851 [ns]
> python 147208 [002] 3272129.346055: sched:sched_stat_wait:
> comm=python pid=147207 delay=4004720 [ns]
> python 147207 [002] 3272129.350058: sched:sched_stat_wait:
> comm=python pid=147208 delay=4002403 [ns]
> python 147208 [002] 3272129.354064: sched:sched_stat_wait:
> comm=python pid=147207 delay=4006037 [ns]
> python 147207 [002] 3272129.358064: sched:sched_stat_wait:
> comm=python pid=147208 delay=3999730 [ns]
>
> More detailed trace around such transition point looks like this:
>
> perf record \
> -e sched:sched_stat_runtime \
> -e sched:sched_stat_wait \
> -e sched:sched_switch \
> -e probe:hrtick \
> -e probe:hrtick_start \
> -e probe:scheduler_tick -C 2
>
> python 873 [002] 3273334.927836: probe:hrtick: (ffffffff8109b430)
> python 873 [002] 3273334.927837: sched:sched_stat_runtime:
> comm=python pid=873 runtime=502915 [ns] vruntime=1549788940847739 [ns]
> python 873 [002] 3273334.927837: sched:sched_stat_runtime:
> comm=python pid=873 runtime=880 [ns] vruntime=1549788940848619 [ns]
> python 873 [002] 3273334.927838: sched:sched_stat_wait:
> comm=python pid=874 delay=503795 [ns]
> python 873 [002] 3273334.927838: probe:hrtick_start: (ffffffff8109f4f0)
> python 873 [002] 3273334.927839: sched:sched_switch:
> prev_comm=python prev_pid=873 prev_prio=120 prev_state=R ==>
> next_comm=python next_pid=874 next_prio=120
> python 874 [002] 3273334.928233: probe:scheduler_tick: (ffffffff810a1f30)
> python 874 [002] 3273334.928233: sched:sched_stat_runtime:
> comm=python pid=874 runtime=395992 [ns] vruntime=1549788941220502 [ns]
> python 874 [002] 3273334.928236: sched:sched_stat_runtime:
> comm=python pid=874 runtime=2278 [ns] vruntime=1549788941222780 [ns]
> python 874 [002] 3273334.928238: sched:sched_stat_wait:
> comm=kworker/2:0 pid=93149 delay=0 [ns]
> python 874 [002] 3273334.928238: probe:hrtick_start: (ffffffff8109f4f0)
> python 874 [002] 3273334.928239: sched:sched_switch:
> prev_comm=python prev_pid=874 prev_prio=120 prev_state=R ==>
> next_comm=kworker/2:0 next_pid=93149 next_prio=120
> kworker/2:0 93149 [002] 3273334.928242: sched:sched_stat_runtime:
> comm=kworker/2:0 pid=93149 runtime=5827 [ns] vruntime=1549788940354446
> [ns]
> kworker/2:0 93149 [002] 3273334.928242: probe:hrtick_start: (ffffffff8109f4f0)
> kworker/2:0 93149 [002] 3273334.928243: sched:sched_stat_wait:
> comm=python pid=874 delay=5827 [ns]
> kworker/2:0 93149 [002] 3273334.928244: probe:hrtick_start: (ffffffff8109f4f0)
> kworker/2:0 93149 [002] 3273334.928244: sched:sched_switch:
> prev_comm=kworker/2:0 prev_pid=93149 prev_prio=120 prev_state=S ==>
> next_comm=python next_pid=874 next_prio=120
> python 874 [002] 3273334.928745: probe:hrtick: (ffffffff8109b430)
> python 874 [002] 3273334.928745: sched:sched_stat_runtime:
> comm=python pid=874 runtime=503888 [ns] vruntime=1549788941726668 [ns]
> python 874 [002] 3273334.928746: sched:sched_stat_runtime:
> comm=python pid=874 runtime=910 [ns] vruntime=1549788941727578 [ns]
> python 874 [002] 3273334.928747: sched:sched_stat_wait:
> comm=python pid=873 delay=908895 [ns]
> python 874 [002] 3273334.928747: probe:hrtick_start: (ffffffff8109f4f0)
> python 874 [002] 3273334.928748: sched:sched_switch:
> prev_comm=python prev_pid=874 prev_prio=120 prev_state=R ==>
> next_comm=python next_pid=873 next_prio=120
> python 873 [002] 3273334.929249: probe:hrtick: (ffffffff8109b430)
> python 873 [002] 3273334.929249: sched:sched_stat_runtime:
> comm=python pid=873 runtime=502850 [ns] vruntime=1549788941351469 [ns]
> python 873 [002] 3273334.929250: sched:sched_stat_runtime:
> comm=python pid=873 runtime=918 [ns] vruntime=1549788941352387 [ns]
> python 873 [002] 3273334.932236: probe:scheduler_tick: (ffffffff810a1f30)
> python 873 [002] 3273334.932237: sched:sched_stat_runtime:
> comm=python pid=873 runtime=2986981 [ns] vruntime=1549788944339368
> [ns]
> python 873 [002] 3273334.932239: sched:sched_stat_runtime:
> comm=python pid=873 runtime=1685 [ns] vruntime=1549788944341053 [ns]
> python 873 [002] 3273334.932239: sched:sched_stat_wait:
> comm=python pid=874 delay=3492434 [ns]
> python 873 [002] 3273334.932240: probe:hrtick_start: (ffffffff8109f4f0)
> python 873 [002] 3273334.932240: sched:sched_switch:
> prev_comm=python prev_pid=873 prev_prio=120 prev_state=R ==>
> next_comm=python next_pid=874 next_prio=120
> python 874 [002] 3273334.932741: probe:hrtick: (ffffffff8109b430)
> python 874 [002] 3273334.932742: sched:sched_stat_runtime:
> comm=python pid=874 runtime=503174 [ns] vruntime=1549788942230752 [ns]
> python 874 [002] 3273334.932743: sched:sched_stat_runtime:
> comm=python pid=874 runtime=948 [ns] vruntime=1549788942231700 [ns]
> python 874 [002] 3273334.936240: probe:scheduler_tick: (ffffffff810a1f30)
> python 874 [002] 3273334.936241: sched:sched_stat_runtime:
> comm=python pid=874 runtime=3497530 [ns] vruntime=1549788945729230
> [ns]
> python 874 [002] 3273334.936249: sched:sched_stat_runtime:
> comm=python pid=874 runtime=8553 [ns] vruntime=1549788945737783 [ns]
> python 874 [002] 3273334.936250: sched:sched_stat_wait:
> comm=python pid=873 delay=4010205 [ns]
> python 874 [002] 3273334.936250: probe:hrtick_start: (ffffffff8109f4f0)
> python 874 [002] 3273334.936251: sched:sched_switch:
> prev_comm=python prev_pid=874 prev_prio=120 prev_state=R ==>
> next_comm=python next_pid=873 next_prio=120
>
>
> Please advise, is this a bug?
>

Could you cat /sys/kernel/debug/sched_features?

Regards,
Wanpeng Li