Re: [PATCH v2 5/7] sched: Implement shared runqueue in CFS

From: David Vernet
Date: Wed Jul 12 2023 - 15:13:26 EST


On Wed, Jul 12, 2023 at 11:30:23AM +0530, Gautham R. Shenoy wrote:
> Hello David,
>
> On Mon, Jul 10, 2023 at 03:03:40PM -0500, David Vernet wrote:
>
> [..snip..]
>
> > ---
>
> > +
> > +static struct task_struct *shared_runq_pop_task(struct rq *rq)
> > +{
> > + unsigned long flags;
> > + struct task_struct *p;
> > + struct shared_runq *shared_runq;
> > +
> > + shared_runq = rq_shared_runq(rq);
> > + if (list_empty(&shared_runq->list))
> > + return NULL;
> > +
> > + spin_lock_irqsave(&shared_runq->lock, flags);
> > + p = list_first_entry_or_null(&shared_runq->list, struct task_struct,
> > + shared_runq_node);
>
>
> Apologies for the bikeshedding comment : Here you are attempting to
> remove the task from the "head", while in shared_runq_push_task below,
> you are adding a task to the tail. Which is the usual queue
> semantics. Then why call them shared_runq_pop_task() and
> shared_runq_push_task() ?
>
> Can we name them __shared_runq_enqueue_task() and
> __shared_runq_pick_next_task() instead ?

Hello Gautham,

So this was previously discussed in [0]. I'm fine with changing the
names if that's others' preferences as well. I think what we have now is
nice in that push and pop are list operations whereas enqueue / dequeue
are scheduler operations, but yeah, push / pop are more-so stack than
queue ops. Using __ to make the list ops "private" is fine with me.

[0]: https://lore.kernel.org/lkml/20230622105841.GH4253@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

> > + if (p && is_cpu_allowed(p, cpu_of(rq)))
> > + list_del_init(&p->shared_runq_node);
> > + else
> > + p = NULL;
> > + spin_unlock_irqrestore(&shared_runq->lock, flags);
> > +
> > + return p;
> > +}
> > +
> > +static void shared_runq_push_task(struct rq *rq, struct task_struct *p)
> > +{
> > + unsigned long flags;
> > + struct shared_runq *shared_runq;
> > +
> > + shared_runq = rq_shared_runq(rq);
> > + spin_lock_irqsave(&shared_runq->lock, flags);
> > + list_add_tail(&p->shared_runq_node, &shared_runq->list);
> > + spin_unlock_irqrestore(&shared_runq->lock, flags);
> > +}
> > +
> > static void shared_runq_enqueue_task(struct rq *rq, struct task_struct *p,
> > int enq_flags)
> > -{}
> > +{
> > + bool task_migrated = enq_flags & ENQUEUE_MIGRATED;
> > + bool task_wakeup = enq_flags & ENQUEUE_WAKEUP;
> > +
> > + /*
> > + * Only enqueue the task in the shared runqueue if:
> > + *
> > + * - SWQUEUE is enabled
> > + * - The task is on the wakeup path
> > + * - The task wasn't purposefully migrated to the current rq by
> > + * select_task_rq()
> > + * - The task isn't pinned to a specific CPU
> > + */
> > + if (!task_wakeup || task_migrated || p->nr_cpus_allowed == 1)
> > + return;
> > +
> > + shared_runq_push_task(rq, p);
> > +}
> >
> > static int shared_runq_pick_next_task(struct rq *rq, struct rq_flags *rf)
> > {
> > - return 0;
> > + struct task_struct *p = NULL;
> > + struct rq *src_rq;
> > + struct rq_flags src_rf;
> > + int ret;
> > +
> > + p = shared_runq_pop_task(rq);
> > + if (!p)
> > + return 0;
> > +
> > + rq_unpin_lock(rq, rf);
> > + raw_spin_rq_unlock(rq);
> > +
> > + src_rq = task_rq_lock(p, &src_rf);
> > +
> > + if (task_on_rq_queued(p) && !task_on_cpu(rq, p)) {
> > + update_rq_clock(src_rq);
> > + src_rq = move_queued_task(src_rq, &src_rf, p, cpu_of(rq));
> > + }
> > +
> > + if (src_rq->cpu != rq->cpu)
> > + ret = 1;
> > + else
> > + ret = -1;
>
>
> So if src_rq->cpu != rq->cpu, then the task has _not_ been moved to
> rq. But you return 1.
>
> While in the else case, since src_rq->cpu == rq->cpu, the task has
> been successfully moved to rq. But you are returning -1,
>
> If newidle_balance() were to interpret this return value as the number
> of tasks pulled, then, shouldn't it be the other way around ?

Yeah, good call. Will incorporate this into v3.

> > +
> > + task_rq_unlock(src_rq, p, &src_rf);
> > +
> > + raw_spin_rq_lock(rq);
> > + rq_repin_lock(rq, rf);
> > +
> > + return ret;
> > }
> >
>
> --
> Thanks and Regards
> gautham.