Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")

From: Peter Zijlstra
Date: Thu Apr 26 2018 - 07:48:56 EST


On Thu, Apr 26, 2018 at 12:31:33PM +0200, Vincent Guittot wrote:
> From: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> Date: Thu, 26 Apr 2018 12:19:32 +0200
> Subject: [PATCH] sched/fair: fix the update of blocked load when newly idle
>
> With commit 31e77c93e432 ("sched/fair: Update blocked load when newly idle"),
> we release the rq->lock when updating blocked load of idle CPUs. This opens
> a time window during which another CPU can add a task to this CPU's cfs_rq.
> The check for a newly added task in idle_balance() is not in the common path.
> Move the out label to include this check.

Ah quite so indeed. This could result in us running idle even though
there is a runnable task around -- which is bad.
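To make the label placement concrete, here is a minimal userspace sketch of the control flow, not the kernel code itself. Everything in it is a hypothetical stand-in: struct toy_rq, unlocked_window() and the two idle_balance_*() variants only model "drop the lock, maybe get a remote enqueue, then decide whether to go idle", under the assumption that some early-exit path jumps to out after the lock has been released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rq: only the counter the check reads. */
struct toy_rq {
	unsigned int h_nr_running;	/* runnable CFS tasks on this CPU */
};

/*
 * Models the window in which rq->lock is dropped. If remote_enqueue is
 * true, another CPU adds a task to this runqueue during that window.
 */
static void unlocked_window(struct toy_rq *rq, bool remote_enqueue)
{
	assert(rq != NULL);
	if (remote_enqueue)
		rq->h_nr_running++;
}

/* Label placed after the recheck: the early exit skips it. */
static int idle_balance_before(struct toy_rq *rq, bool early_exit,
			       bool remote_enqueue)
{
	int pulled_task = 0;

	/* Lock is dropped on both paths in this model. */
	unlocked_window(rq, remote_enqueue);
	if (early_exit)
		goto out;

	/* Recheck for a task enqueued while the lock was dropped. */
	if (rq->h_nr_running && !pulled_task)
		pulled_task = 1;
out:
	return pulled_task;	/* 0 means "go idle" */
}

/* Label moved above the recheck: every path runs it. */
static int idle_balance_after(struct toy_rq *rq, bool early_exit,
			      bool remote_enqueue)
{
	int pulled_task = 0;

	unlocked_window(rq, remote_enqueue);
	if (early_exit)
		goto out;
out:
	if (rq->h_nr_running && !pulled_task)
		pulled_task = 1;

	return pulled_task;
}
```

In this model, calling idle_balance_before() with early_exit and remote_enqueue both true returns 0 while h_nr_running is 1, which is the "running idle with a runnable task around" case; idle_balance_after() returns 1 for the same inputs.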

> Fixes: 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
> Reported-by: Heiner Kallweit <hkallweit1@xxxxxxxxx>
> Reported-by: Niklas Söderlund <niklas.soderlund@xxxxxxxxxxxx>
> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> ---
> kernel/sched/fair.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0951d1c..15a9f5e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9847,6 +9847,7 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
>  	if (curr_cost > this_rq->max_idle_balance_cost)
>  		this_rq->max_idle_balance_cost = curr_cost;
>
> +out:
>  	/*
>  	 * While browsing the domains, we released the rq lock, a task could
>  	 * have been enqueued in the meantime. Since we're not going idle,
> @@ -9855,7 +9856,6 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
>  	if (this_rq->cfs.h_nr_running && !pulled_task)
>  		pulled_task = 1;
>
> -out:
>  	/* Move the next balance forward */
>  	if (time_after(this_rq->next_balance, next_balance))
>  		this_rq->next_balance = next_balance;