Re: [PATCH v3] sched: async unthrottling for cfs bandwidth

From: Peter Zijlstra
Date: Fri Nov 18 2022 - 07:47:40 EST


On Wed, Nov 16, 2022 at 04:54:18PM -0800, Josh Don wrote:

> +#ifdef CONFIG_SMP
> +static void __cfsb_csd_unthrottle(void *arg)
> +{
> +	struct rq *rq = arg;
> +	struct rq_flags rf;
> +	struct cfs_rq *cursor, *tmp;
> +
> +	rq_lock(rq, &rf);
> +
> +	/*
> +	 * Since we hold rq lock we're safe from concurrent manipulation of
> +	 * the CSD list. However, this RCU critical section annotates the
> +	 * fact that we pair with sched_free_group_rcu(), so that we cannot
> +	 * race with group being freed in the window between removing it
> +	 * from the list and advancing to the next entry in the list.
> +	 */
> +	rcu_read_lock();

preempt_disable() -- through rq->lock -- also holds off rcu. Strictly
speaking this here is superfluous. But if you want it as an annotation,
that's fine I suppose.

> +
> +	list_for_each_entry_safe(cursor, tmp, &rq->cfsb_csd_list,
> +				 throttled_csd_list) {
> +		list_del_init(&cursor->throttled_csd_list);
> +
> +		if (cfs_rq_throttled(cursor))
> +			unthrottle_cfs_rq(cursor);
> +	}
> +
> +	rcu_read_unlock();
> +
> +	rq_unlock(rq, &rf);
> +}
> +
> +static inline void __unthrottle_cfs_rq_async(struct cfs_rq *cfs_rq)
> +{
> +	struct rq *rq = rq_of(cfs_rq);
> +
> +	if (rq == this_rq()) {
> +		unthrottle_cfs_rq(cfs_rq);
> +		return;
> +	}

Ideally we'd first queue all the remotes and only then process the local
one, but given how all this is organized that doesn't seem trivial to
arrange.

Maybe have this function return false when the cfs_rq is local, and have
the caller save that cfs_rq in a local var to process again later;
dunno, that might turn messy.
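
Something along these lines, perhaps (an entirely untested sketch; the
bool return convention and the local_unthrottle variable are invented
here for illustration, and the caller is assumed to be the runtime
distribution loop):

	static inline bool __unthrottle_cfs_rq_async(struct cfs_rq *cfs_rq)
	{
		struct rq *rq = rq_of(cfs_rq);

		/* Tell the caller to process the local cfs_rq last. */
		if (rq == this_rq())
			return false;

		if (SCHED_WARN_ON(!list_empty(&cfs_rq->throttled_csd_list)))
			return true;

		list_add_tail(&cfs_rq->throttled_csd_list, &rq->cfsb_csd_list);
		smp_call_function_single_async(cpu_of(rq), &rq->cfsb_csd);
		return true;
	}

and in the caller:

		struct cfs_rq *local_unthrottle = NULL;

		/* queue all the remote cfs_rq's first ... */
		if (!__unthrottle_cfs_rq_async(cfs_rq))
			local_unthrottle = cfs_rq;

		/* ... then unthrottle the local one, if any. */
		if (local_unthrottle)
			unthrottle_cfs_rq(local_unthrottle);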

> +
> +	/* Already enqueued */
> +	if (SCHED_WARN_ON(!list_empty(&cfs_rq->throttled_csd_list)))
> +		return;
> +
> +	list_add_tail(&cfs_rq->throttled_csd_list, &rq->cfsb_csd_list);
> +
> +	smp_call_function_single_async(cpu_of(rq), &rq->cfsb_csd);

Hurmph.. so I was expecting something like:

	first = list_empty(&rq->cfsb_csd_list);
	list_add_tail(&cfs_rq->throttled_csd_list, &rq->cfsb_csd_list);
	if (first)
		smp_call_function_single_async(cpu_of(rq), &rq->cfsb_csd);

But I suppose I'm remembering the 'old' version. I don't think it is
broken as written; there's a very narrow window where you'll end up
sending a second IPI for naught, but meh.

> +}

Let me go queue this thing; we can always improve upon matters later.