Re: [RFC PATCH 3/3] sched: Implement shared wakequeue in CFS

From: Peter Zijlstra
Date: Thu Jun 22 2023 - 06:30:10 EST


On Thu, Jun 22, 2023 at 02:41:57PM +0530, Gautham R. Shenoy wrote:

> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -215,21 +215,17 @@ static void swqueue_enqueue(struct rq *rq, struct task_struct *p, int enq_flags)
> {
> unsigned long flags;
> struct swqueue *swqueue;
> - bool task_migrated = enq_flags & ENQUEUE_MIGRATED;
> - bool task_wakeup = enq_flags & ENQUEUE_WAKEUP;
>
> /*
> * Only enqueue the task in the shared wakequeue if:
> *
> * - SWQUEUE is enabled
> - * - The task is on the wakeup path
> - * - The task wasn't purposefully migrated to the current rq by
> - * select_task_rq()
> - * - The task isn't pinned to a specific CPU
> + * - select_idle_sibling() didn't find an idle CPU to enqueue this wakee task.
> */
> - if (!task_wakeup || task_migrated || p->nr_cpus_allowed == 1)

You lost the nr_cpus_allowed == 1 case, that actually made sense, except
he didn't don't have a hook in set_cpus_allowed() to fix it back up.

> + if (!READ_ONCE(p->sched_add_to_swq))
> return;

So suppose we fill all CPUs with tasks T_{0..n-n1}, sis finds them a
nice idle cpu and we don't get queued.

Then wake up another n tasks T_{n...2n-1}, none of them find an idle
CPU, as per the above them all taken. These all do get queued.

However, suppose T>=n does a wakeup-preemption, and we end up with T>=n
running while T<n get queued on the rq, but not the 'queue' (I so hate
the swqueue name).

Then one CPU has both it's tasks go 'away' and becomes idle.

At this point the 'queue' contains only running tasks that are not
migratable and all the tasks that could be migrated are not on the queue
-- IOW it is completely useless.

Is this the intention?