Re: [tip:sched/eevdf] [sched/fair] e0c2ff903c: phoronix-test-suite.blogbench.Write.final_score -34.8% regression

From: Chen Yu
Date: Thu Aug 17 2023 - 21:55:58 EST


On 2023-08-14 at 14:49:14 +0200, Peter Zijlstra wrote:
> On Fri, Aug 11, 2023 at 09:11:21AM +0800, Chen Yu wrote:
> > On 2023-08-10 at 21:24:37 +0800, kernel test robot wrote:
> > >
> > >
> > > Hello,
> > >
> > > kernel test robot noticed a -34.8% regression of phoronix-test-suite.blogbench.Write.final_score on:
> > >
> > >
> > > commit: e0c2ff903c320d3fd3c2c604dc401b3b7c0a1d13 ("sched/fair: Remove sched_feat(START_DEBIT)")
> > > https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git sched/eevdf
> > >
> > > testcase: phoronix-test-suite
> > > test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz (Cascade Lake) with 512G memory
> > > parameters:
> > >
> > > test: blogbench-1.1.0
> > > option_a: Write
> > > cpufreq_governor: performance
> > >
>
> Is this benchmark fork() heavy?
>

It is not fork() heavy. After creating the threads, it just loops,
writing to some files.

> > It seems that commit e0c2ff903c32 removed the sched_feat(START_DEBIT) for initial
> > task, but also increases the vruntime for non-initial task:
> > Before the e0c2ff903c32, the vruntime for a enqueued task is:
> > cfs_rq->min_vruntime
> > After the e0c2ff903c32, the vruntime for a enqueued task is:
> > avg_vruntime(cfs_rq) = \Sum v_i * w_i / W
> > = \Sum v_i / nr_tasks
> > which is usually higher than cfs_rq->min_vruntime, and we give less sleep bonus to
> > the wakee, which could bring more or less impact to different workloads.
> > But since later we switched to lag based placement, this new vruntime will minus
> > lag, which could mitigate this problem.
>
> Right.. but given this problem was bisected through the lag based
> placement to this commit, I wondered about fork() / pthread_create().
>
> If this is indeed fork()/pthread_create() heavy, could you please see if
> disabling PLACE_DEADLINE_INITIAL helps?

Tested with PLACE_DEADLINE_INITIAL disabled; not much difference is observed.

The baseline is Commit 246c6d7ab4d0 ("sched/eevdf: Better handle mixed slice length")

PLACE_DEADLINE_INITIAL  %change  NO_PLACE_DEADLINE_INITIAL  metric
----------------------  -------  -------------------------  ------
                  4166    -4.7%                       3969  phoronix-test-suite.blogbench.Write.final_score
                330.88    +4.4%                     345.49  phoronix-test-suite.time.elapsed_time
                330.88    +4.4%                     345.49  phoronix-test-suite.time.elapsed_time.max
                150672    -0.0%                     150640  phoronix-test-suite.time.file_system_inputs
              29947344    -2.2%                   29277840  phoronix-test-suite.time.file_system_outputs
               1954038    -0.3%                    1947949  phoronix-test-suite.time.involuntary_context_switches
                163.00    +1.2%                     165.00  phoronix-test-suite.time.major_page_faults
                 32256    +0.7%                      32472  phoronix-test-suite.time.maximum_resident_set_size
                152607    -1.1%                     150874  phoronix-test-suite.time.minor_page_faults
                  4096    +0.0%                       4096  phoronix-test-suite.time.page_size
                  8169    -5.0%                       7764  phoronix-test-suite.time.percent_of_cpu_this_job_got
                 26616    -0.9%                      26374  phoronix-test-suite.time.system_time
                416.59    +8.3%                     450.98  phoronix-test-suite.time.user_time
               1764497    -0.8%                    1749992  phoronix-test-suite.time.voluntary_context_switches


blogbench.Write.final_score on different commits in eevdf branch:

sched/fair: Add cfs_rq::avg_vruntime
5217

sched/fair: Remove sched_feat(START_DEBIT)
3223

sched/fair: Add lag based placement
2736

sched/fair: Implement an EEVDF-like scheduling policy
3942

sched/fair: Commit to EEVDF
3957

sched/eevdf: Better handle mixed slice length
3836


It seems that "Remove sched_feat(START_DEBIT)" causes most of the
drop (5217 -> 3223), and "Implement an EEVDF-like scheduling policy"
restores part of the throughput (3942). The score for
"sched/fair: Add lag based placement" might not be reliable, because
at that commit place_entity() scales the vlag using se->load.weight
directly, while "Implement an EEVDF-like scheduling policy" fixes
that by using scale_load_down().
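
To illustrate why the unscaled weight could distort the placement, here
is a minimal userspace sketch, not the kernel code. It assumes a 64-bit
kernel where se->load.weight is kept scaled up by 2^10, while the queue
load W is accumulated from scale_load_down() values, and it models the
lag adjustment as lag' = lag * (W + w) / W. The model_* helper names and
FIXEDPOINT_SHIFT are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64-bit kernels keep load.weight scaled up by 2^10. */
#define FIXEDPOINT_SHIFT 10

/* Hypothetical stand-in for scale_load_down(): drop the extra
 * resolution, but never let a non-zero weight fall below 2. */
static int64_t model_scale_load_down(int64_t w)
{
	int64_t s;

	if (w == 0)
		return 0;
	s = w >> FIXEDPOINT_SHIFT;
	return s < 2 ? 2 : s;
}

/* Simplified model of the lag adjustment at placement:
 *   lag' = lag * (W + w) / W
 * where W is the load already queued and w is the weight of the
 * entity being placed. Both must be in the same units. */
static int64_t model_inflate_lag(int64_t vlag, int64_t queue_load, int64_t w)
{
	assert(queue_load > 0);
	return vlag * (queue_load + w) / queue_load;
}
```

In this model, with three nice-0 tasks queued (W = 3 * 1024) and
vlag = 1000, passing the scaled-down weight (1024) gives 1333, while
passing the raw weight (1024 << 10) gives 342333, so the lag subtracted
from the vruntime is overstated by more than two orders of magnitude.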

I'll check what RUN_TO_PARITY brings to blogbench.

thanks,
Chenyu