Re: [RFC PATCH v4] sched: Fix performance regression introduced by mm_cid

From: Peter Zijlstra
Date: Wed Apr 12 2023 - 05:11:06 EST


On Tue, Apr 11, 2023 at 09:12:21PM +0800, Aaron Lu wrote:

> Forget about this "v4 is better than v2 and v3" part, my later test
> showed the contention can also rise to around 18% for v4.

So while I can reproduce the initial regression on a HSW-EX system
(4*18*2) and get lovely things like:

    34.47%--schedule_hrtimeout_range_clock
            schedule
            |
             --34.42%--__schedule
                       |
                       |--31.86%--_raw_spin_lock
                       |          |
                       |           --31.65%--native_queued_spin_lock_slowpath
                       |
                        --0.72%--dequeue_task_fair
                                  |
                                   --0.60%--dequeue_entity

On a --threads=144 run; it is completely gone when I use v4:

     6.92%--__schedule
            |
            |--2.16%--dequeue_task_fair
            |          |
            |           --1.69%--dequeue_entity
            |                     |
            |                     |--0.61%--update_load_avg
            |                     |
            |                      --0.54%--update_curr
            |
            |--1.30%--pick_next_task_fair
            |          |
            |           --0.54%--set_next_entity
            |
            |--0.77%--psi_task_switch
            |
             --0.69%--switch_mm_irqs_off


:-(