Re: [RFT for v9] (Was Re: [PATCH v8 -tip 00/26] Core scheduling)

From: Ning, Hongyu
Date: Fri Nov 13 2020 - 04:23:08 EST


On 2020/11/7 4:55, Joel Fernandes wrote:
> All,
>
> I am getting ready to send the next v9 series based on tip/master
> branch. Could you please give the below tree a try and report any results in
> your testing?
> git tree:
> https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git (branch coresched)
> git log:
> https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/log/?h=coresched
>
> The major changes in this series are the improvements:
> (1)
> "sched: Make snapshotting of min_vruntime more CGroup-friendly"
> https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/commit/?h=coresched-v9-for-test&id=9a20a6652b3c50fd51faa829f7947004239a04eb
>
> (2)
> "sched: Simplify the core pick loop for optimized case"
> https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/commit/?h=coresched-v9-for-test&id=0370117b4fd418cdaaa6b1489bfc14f305691152
>
> And a bug fix:
> (1)
> "sched: Enqueue task into core queue only after vruntime is updated"
> https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/commit/?h=coresched-v9-for-test&id=401dad5536e7e05d1299d0864e6fc5072029f492
>
> There are also 2 more bug fixes that I squashed-in related to kernel
> protection and a crash seen on the tip/master branch.
>
> Hoping to send the series next week out to the list.
>
> Have a great weekend, and Thanks!
>
> - Joel
>
>
> On Mon, Oct 19, 2020 at 09:43:10PM -0400, Joel Fernandes (Google) wrote:

Adding test results for 4 workloads on the core scheduling v9 candidate:

- kernel under test:
-- coresched community v9 candidate from https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git (branch coresched)
-- latest commit: 2e8591a330ff (HEAD -> coresched, origin/coresched) NEW: sched: Add a coresched command line option
-- coresched=on kernel parameter applied
- workloads:
-- A. sysbench cpu (192 threads) + sysbench cpu (192 threads)
-- B. sysbench cpu (192 threads) + sysbench mysql (192 threads, mysqld forced into the same cgroup)
-- C. uperf netperf.xml (192 threads over TCP or UDP protocol separately)
-- D. will-it-scale context_switch via pipe (192 threads)
- test machine setup:
CPU(s): 192
On-line CPU(s) list: 0-191
Thread(s) per core: 2
Core(s) per socket: 48
Socket(s): 2
NUMA node(s): 4
- test results, no obvious performance drop compared to community v8 build:
-- workload A:
+----------------------+------+----------------------+------------------------+
|                      | **   | sysbench cpu * 192   | sysbench cpu * 192     |
+======================+======+======================+========================+
| cgroup               | **   | cg_sysbench_cpu_0    | cg_sysbench_cpu_1      |
+----------------------+------+----------------------+------------------------+
| record_item          | **   | Tput_avg (events/s)  | Tput_avg (events/s)    |
+----------------------+------+----------------------+------------------------+
| coresched_normalized | **   | 0.98                 | 1.01                   |
+----------------------+------+----------------------+------------------------+
| default_normalized   | **   | 1                    | 1                      |
+----------------------+------+----------------------+------------------------+
| smtoff_normalized    | **   | 0.59                 | 0.6                    |
+----------------------+------+----------------------+------------------------+

-- workload B:
+----------------------+------+----------------------+------------------------+
|                      | **   | sysbench cpu * 192   | sysbench mysql * 192   |
+======================+======+======================+========================+
| cgroup               | **   | cg_sysbench_cpu_0    | cg_sysbench_mysql_0    |
+----------------------+------+----------------------+------------------------+
| record_item          | **   | Tput_avg (events/s)  | Tput_avg (events/s)    |
+----------------------+------+----------------------+------------------------+
| coresched_normalized | **   | 1.02                 | 0.78                   |
+----------------------+------+----------------------+------------------------+
| default_normalized   | **   | 1                    | 1                      |
+----------------------+------+----------------------+------------------------+
| smtoff_normalized    | **   | 0.59                 | 0.75                   |
+----------------------+------+----------------------+------------------------+

-- workload C:
+----------------------+------+---------------------------+---------------------------+
|                      | **   | uperf netperf TCP * 192   | uperf netperf UDP * 192   |
+======================+======+===========================+===========================+
| cgroup               | **   | cg_uperf                  | cg_uperf                  |
+----------------------+------+---------------------------+---------------------------+
| record_item          | **   | Tput_avg (Gb/s)           | Tput_avg (Gb/s)           |
+----------------------+------+---------------------------+---------------------------+
| coresched_normalized | **   | 0.65                      | 0.67                      |
+----------------------+------+---------------------------+---------------------------+
| default_normalized   | **   | 1                         | 1                         |
+----------------------+------+---------------------------+---------------------------+
| smtoff_normalized    | **   | 0.83                      | 0.91                      |
+----------------------+------+---------------------------+---------------------------+

-- workload D:
+----------------------+------+-------------------------------+
|                      | **   | will-it-scale * 192           |
|                      |      | (pipe based context_switch)   |
+======================+======+===============================+
| cgroup               | **   | cg_will-it-scale              |
+----------------------+------+-------------------------------+
| record_item          | **   | threads_avg                   |
+----------------------+------+-------------------------------+
| coresched_normalized | **   | 0.29                          |
+----------------------+------+-------------------------------+
| default_normalized   | **   | 1.00                          |
+----------------------+------+-------------------------------+
| smtoff_normalized    | **   | 0.87                          |
+----------------------+------+-------------------------------+

- notes on the normalized rows in the test results:
* coresched_normalized: SMT on, core scheduling enabled; test result normalized to the default result
* default_normalized: SMT on, core scheduling disabled; test result normalized to itself, so always 1
* smtoff_normalized: SMT off; test result normalized to the default result
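
For anyone who prefers a relative change over a ratio, here is a minimal sketch of the normalization described in the notes, applied to the normalized values from the tables above. It is not the harness used for these runs. The names normalize(), percent_change() and report() are hypothetical, and the short workload labels are chosen only for illustration.

```python
# Normalized values copied from the result tables in this mail.
# The default configuration (SMT on, core scheduling disabled) is 1 by definition.
RESULTS = {
    "A: sysbench cpu (cg_sysbench_cpu_0)": {"coresched": 0.98, "smtoff": 0.59},
    "A: sysbench cpu (cg_sysbench_cpu_1)": {"coresched": 1.01, "smtoff": 0.6},
    "B: sysbench cpu": {"coresched": 1.02, "smtoff": 0.59},
    "B: sysbench mysql": {"coresched": 0.78, "smtoff": 0.75},
    "C: uperf TCP": {"coresched": 0.65, "smtoff": 0.83},
    "C: uperf UDP": {"coresched": 0.67, "smtoff": 0.91},
    "D: will-it-scale context_switch": {"coresched": 0.29, "smtoff": 0.87},
}


def normalize(raw, default_raw):
    """Divide a raw measurement by the default-configuration measurement."""
    if default_raw <= 0:
        raise ValueError("default measurement must be positive")
    return raw / default_raw


def percent_change(normalized):
    """Convert a normalized ratio into a percent change versus default."""
    return round((normalized - 1.0) * 100.0, 1)


def report(results):
    """Return one summary line per workload."""
    lines = []
    for name, cfg in results.items():
        cs = percent_change(cfg["coresched"])
        off = percent_change(cfg["smtoff"])
        lines.append(f"{name}: coresched {cs:+.1f}%, smtoff {off:+.1f}%")
    return lines


if __name__ == "__main__":
    for line in report(RESULTS):
        print(line)
```

Raw throughput numbers are not included in this mail, so normalize() is shown only to make the definition explicit; the report works from the already normalized values.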


Hongyu