Re: [PATCH 1/2] sched/fair: optimization of update_blocked_averages()

From: Peter Zijlstra
Date: Fri Feb 08 2019 - 11:31:06 EST


On Wed, Feb 06, 2019 at 05:14:21PM +0100, Vincent Guittot wrote:
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -346,6 +346,18 @@ static inline bool list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
> static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
> {
> if (cfs_rq->on_list) {
> + struct rq *rq = rq_of(cfs_rq);
> +
> + /*
> + * With cfs_rq being unthrottled/throttled during an enqueue,
> + * it can happen that tmp_alone_branch points to the leaf that
> + * we finally want to delete. In this case, tmp_alone_branch is
> + * moved to the prev element, but it will point back to
> + * rq->leaf_cfs_rq_list at the end of the enqueue.
> + */
> + if (rq->tmp_alone_branch == &cfs_rq->leaf_cfs_rq_list)
> + rq->tmp_alone_branch = cfs_rq->leaf_cfs_rq_list.prev;
> +
> list_del_rcu(&cfs_rq->leaf_cfs_rq_list);
> cfs_rq->on_list = 0;
> }

So that is:

enqueue_task_fair()
  enqueue_entity()
    list_add_leaf_cfs_rq()
  check_enqueue_throttle()
    throttle_cfs_rq()
      walk_tg_tree_from()
        tg_throttle_down()
          list_del_leaf_cfs_rq()

Which can try to remove a cfs_rq that we just added.

And because the list is kept in bottom-up order, while the deletion
happens during a downward walk, we must step back (prev) in the list.

So far so good I suppose.
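
To make the marker dance concrete, here is a minimal userspace sketch
of the pattern. The list helpers are simplified stand-ins for
<linux/list.h>, and the variable names are only illustrative, not the
kernel's actual data structures:

#include <assert.h>

/* Minimal doubly-linked list, mimicking <linux/list.h> semantics. */
struct list_head { struct list_head *prev, *next; };

static void list_init(struct list_head *h) { h->prev = h->next = h; }

static void list_add(struct list_head *new, struct list_head *head)
{
	new->next = head->next;
	new->prev = head;
	head->next->prev = new;
	head->next = new;
}

static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

int main(void)
{
	struct list_head leaf_list, a, b;
	struct list_head *tmp_alone_branch;

	list_init(&leaf_list);
	list_add(&a, &leaf_list);	/* added first, ends up last */
	list_add(&b, &leaf_list);	/* added second, heads the list */

	/* The marker points at the most recently added (child) entry. */
	tmp_alone_branch = &b;

	/* Deleting the marked entry: step back first, as in the patch. */
	if (tmp_alone_branch == &b)
		tmp_alone_branch = b.prev;
	list_del(&b);

	/* The marker now points at a still-linked node (the list head),
	 * matching the "point to rq->leaf_cfs_rq_list" case above. */
	assert(tmp_alone_branch == &leaf_list);
	return 0;
}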

> @@ -4449,8 +4465,10 @@ static int tg_throttle_down(struct task_group *tg, void *data)
> struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
>
> /* group is entering throttled state, stop time */
> - if (!cfs_rq->throttle_count)
> + if (!cfs_rq->throttle_count) {
> cfs_rq->throttled_clock_task = rq_clock_task(rq);
> + list_del_leaf_cfs_rq(cfs_rq);
> + }
> cfs_rq->throttle_count++;
>
> return 0;
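
For completeness, this hunk relies on the usual nesting-counter idiom:
the side effects (freezing the clock, leaving the leaf list) must fire
only on the 0 -> 1 transition, because a group can be throttled through
several ancestors at once. A toy model of that guard, with names that
are mine rather than the kernel's:

#include <stdio.h>

static int throttle_count;	/* stand-in for cfs_rq->throttle_count */

static void throttle_down(void)
{
	if (!throttle_count) {
		/* First throttler only: stop time, leave the leaf list. */
		printf("0->1: stop clock, list_del_leaf_cfs_rq()\n");
	}
	throttle_count++;
}

static void throttle_up(void)
{
	throttle_count--;
	if (!throttle_count) {
		/* Last unthrottler only: resume time. */
		printf("1->0: resume clock\n");
	}
}

int main(void)
{
	throttle_down();	/* throttled via one ancestor */
	throttle_down();	/* nested throttle: no side effects */
	throttle_up();
	throttle_up();
	return 0;
}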