Re: [PATCH v2 0/4] sched/fair: Active balancer RT/DL preemption fix

From: Valentin Schneider
Date: Tue Oct 01 2019 - 06:29:49 EST


(expanded the Cc list)
RT/DL folks, any thoughts on this?

On 15/08/2019 15:51, Valentin Schneider wrote:
> Vincent's load balance rework [1] got me thinking about how and where we
> use rq.nr_running vs rq.cfs.h_nr_running checks, and this led me to
> stare intently at the active load balancer.
>
> I haven't seen it happen (yet), but from reading the code it really looks
> like we can have some scenarios where the cpu_stopper ends up preempting
> a task of a higher class than CFS (RT or DL), since we never actually
> check which task is running on the remote rq.
>
> This series shuffles things around the CFS active load balancer to prevent
> this from happening.
>
> - Patch 1 is a freebie cleanup
> - Patch 2 is a preparatory code move
> - Patch 3 adds h_nr_running checks
> - Patch 4 adds a sched class check + detach_one_task() to the active balancer
>
> This is based on top of today's tip/sched/core:
> a46d14eca7b7 ("sched/fair: Use rq_lock/unlock in online_fair_sched_group")
>
> v1 -> v2:
> - (new patch) Added need_active_balance() cleanup
>
> - Tweaked active balance code move to respect existing
>   sd->nr_balance_failed modifications
> - Added explicit checks of active_load_balance()'s return value
>
> - Added an h_nr_running < 1 check before kicking the cpu_stopper
>
> - Added a detach_one_task() call in active_load_balance() when the remote
>   rq's running task is of a higher class than CFS
>
> [1]: https://lore.kernel.org/lkml/1564670424-26023-1-git-send-email-vincent.guittot@xxxxxxxxxx/
>
> Valentin Schneider (4):
>   sched/fair: Make need_active_balance() return bools
>   sched/fair: Move active balance logic to its own function
>   sched/fair: Check for CFS tasks before detach_one_task()
>   sched/fair: Prevent active LB from preempting higher sched classes
>
>  kernel/sched/fair.c | 151 ++++++++++++++++++++++++++++----------------
>  1 file changed, 95 insertions(+), 56 deletions(-)
>
> --
> 2.22.0
>
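
For anyone skimming, the decision described in the v2 changelog can be
sketched roughly as below. This is a userspace toy model only, not the code
from the patches: every toy_* name is hypothetical, the struct is a
stand-in for the relevant bits of the remote rq, and the real logic in
kernel/sched/fair.c deals with locking and sd->nr_balance_failed, which
are left out here.

```c
#include <stdbool.h>

/*
 * Toy scheduling classes, ordered lowest to highest priority.
 * Hypothetical stand-ins for the kernel's sched classes.
 */
enum toy_class { TOY_IDLE, TOY_CFS, TOY_RT, TOY_DL, TOY_STOP };

/* Hypothetical, simplified view of the remote (busiest) runqueue. */
struct toy_rq {
	unsigned int nr_running;       /* runnable tasks of any class */
	unsigned int cfs_h_nr_running; /* runnable CFS tasks only */
	enum toy_class curr_class;     /* class of the task currently running */
};

enum toy_action {
	TOY_SKIP,          /* nothing CFS to pull, leave the rq alone */
	TOY_DETACH_QUEUED, /* pull a queued CFS task, do not preempt curr */
	TOY_KICK_STOPPER,  /* curr is CFS, so stopper preemption is acceptable */
};

/* Decide what an active balance attempt should do with the remote rq. */
static enum toy_action toy_active_balance(const struct toy_rq *busiest)
{
	/* Assumed to mirror the "h_nr_running < 1" check from the changelog. */
	if (busiest->cfs_h_nr_running < 1)
		return TOY_SKIP;

	/* Running task is above CFS: do not let the stopper preempt it. */
	if (busiest->curr_class > TOY_CFS)
		return TOY_DETACH_QUEUED;

	return TOY_KICK_STOPPER;
}
```

Note that checking nr_running alone would not distinguish the first case: an
rq running a single RT task has nr_running == 1 but no CFS task to migrate.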