Re: [RFC PATCH 3/3] sched: Implement shared wakequeue in CFS

From: Gautham R. Shenoy
Date: Wed Jun 21 2023 - 04:17:49 EST


Hello David,

On Tue, Jun 20, 2023 at 03:08:22PM -0500, David Vernet wrote:
> On Mon, Jun 19, 2023 at 11:43:13AM +0530, Gautham R. Shenoy wrote:
> > Hello David,
> >
> >
> > On Tue, Jun 13, 2023 at 12:20:04AM -0500, David Vernet wrote:
> > [..snip..]
> >
> > > +static void swqueue_enqueue(struct rq *rq, struct task_struct *p, int enq_flags)
> > > +{
> > > + unsigned long flags;
> > > + struct swqueue *swqueue;
> > > + bool task_migrated = enq_flags & ENQUEUE_MIGRATED;
> > > + bool task_wakeup = enq_flags & ENQUEUE_WAKEUP;
> > > +
> > > + /*
> > > + * Only enqueue the task in the shared wakequeue if:
> > > + *
> > > + * - SWQUEUE is enabled
> > > + * - The task is on the wakeup path
> > > + * - The task wasn't purposefully migrated to the current rq by
> > > + * select_task_rq()
> > > + * - The task isn't pinned to a specific CPU
> > > + */
> > > + if (!task_wakeup || task_migrated || p->nr_cpus_allowed == 1)
> > > + return;
> >
> > In select_task_rq_fair(), having determined if the target of task
> > wakeup should be the task's previous CPU vs the waker's current CPU,
> > we spend quite a bit of time already to determine if there is an idle
> > core/CPU in the target's LLC. @rq would correspond to CPU chosen as a
> > result of that scan or if no idle CPU exists, @rq corresponds to the
> > target CPU determined by wake_affine_idle()/wake_affine_weight().
> >
> > So if the CPU of @rq is idle here, can we not simply return here?
>
> Hi Gautum,
>
> Sorry, I'm not sure I'm quite following the issue you're pointing out.
> We don't use swqueue if the task was migrated following
> select_task_rq_fair(). That's the idea with us returning if the task was
> migrated (the second conditional in that if). If I messed up that logic
> somehow, it should be fixed.

Sorry, my bad. I see it now.

So as per this patch, the only time we enqueue the task on the shared
wakeup is if the target of try_to_wake_up() is the same CPU where the
task ran previously.

When wake_affine logic fails and the previous CPU is chosen as the
target, and when there are no other idle cores/threads in the LLC of
the previous CPU, it makes sense to queue the task on the
shared-wakequeue instead of on a busy previous CPU.

And when that previous CPU is idle, the try_to_wake_up() would have
woken it up via ttwu_queue(), so before going idle the next time it
will check the shared queue for the task and find it. We should be
good in this case.

Now, it is possible that select_task_rq_fair() ended up selecting the
waker's CPU as the target based on the
wake_affine_idle()/wake_affine_weight() logic. And if there is no idle
core/thread on the waker's LLC, the target would be the busy waker
CPU. In the case when the waker CPU is different from the task's
previous CPU, due to ENQUEUE_MIGRATE flag being set, the task won't be
queued on the shared wakequeue and instead has to wait on the busy
waker CPU.

I wonder if it makes sense to enqueue the task on the shared wakequeue
in this scenario as well.

>
> > Or if the idea is to avoid the scan for an idle core/CPU in
> > select_task_rq_fair(), then
>
> No, swqueue_enqueue() is called from enqueue_task_fair(), not
> select_task_rq_fair(). If select_task_rq_fair() (which is of course
> called beforehand for a waking task) found an idle core/CPU, we don't
> bother using swqueue, as mentioned above.

Got it. Thanks!

>
> Thanks,
> David

--
Thanks and Regards
gautham.