Re: [PATCH] sched/core: Minor optimize ttwu_runnable()

From: Peter Zijlstra
Date: Tue Nov 08 2022 - 05:08:50 EST


On Tue, Nov 08, 2022 at 10:11:49AM +0100, Peter Zijlstra wrote:
> On Mon, Nov 07, 2022 at 03:54:38PM +0000, Valentin Schneider wrote:
>
> > So that's the part for the p->sched_class->task_woken() callback, which
> > only affects RT and DL (and only does something when !p->on_cpu). I *think*
> > it's fine to remove it from ttwu_runnable() as any push/pull should have
> > happened when other tasks were enqueued on the same CPU - with that said,
> > it wouldn't hurt to double check this :-)
> >
> >
> > As for the check_preempt_curr(), since per the above p can be preempted,
> > you could have scenarios right now with CFS tasks where
> > ttwu_runnable()->check_preempt_curr() causes NEED_RESCHED to be set.
> >
> > e.g. p0 does
> >
> > set_current_state(TASK_UNINTERRUPTIBLE)
> >
> > but then gets interrupted by the tick, a p1 gets selected to run instead
> > because of check_preempt_tick(), and then runs long enough to have
> > check_preempt_curr() decide to let p0 preempt p1.
> >
> > That does require specific timing (lower tick frequency should make this
> > more likely) and probably task niceness distribution too, but isn't
> > impossible.
> >
> > Maybe try reading p->on_cpu, and only do the quick task state update if
> > it's still the current task, otherwise do the preemption checks?
>
> I'm confused...

I am, and Valentin has a point. The task could indeed be preempted, and in
that case check_preempt_curr() could indeed put it back on the CPU.

In that case his suggestion might make sense; something along these
lines, I suppose...

(And I think we can still do the reorg I proposed elsewhere, but I've not
yet tried.)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index cb2aa2b54c7a..6944d9473295 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3698,9 +3698,16 @@ static int ttwu_runnable(struct task_struct *p, int wake_flags)

 	rq = __task_rq_lock(p, &rf);
 	if (task_on_rq_queued(p)) {
-		/* check_preempt_curr() may use rq clock */
-		update_rq_clock(rq);
-		ttwu_do_wakeup(rq, p, wake_flags, &rf);
+		if (!p->on_cpu) {
+			/*
+			 * When on_rq && !on_cpu the task is preempted, see if
+			 * it should preempt whatever is current there now.
+			 */
+			update_rq_clock(rq);
+			check_preempt_curr(rq, p, wake_flags);
+		}
+		WRITE_ONCE(p->__state, TASK_RUNNING);
+		trace_sched_wakeup(p);
 		ret = 1;
 	}
 	__task_rq_unlock(rq, &rf);
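
For anyone who wants to poke at the three cases without a kernel tree, the
branch structure above can be modelled in userspace, roughly like this. It is
only a sketch: every model_* name is made up for illustration, there is no
locking, no rq clock and no tracepoint, and "preempt_checked" merely records
whether the patched code would have reached check_preempt_curr().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are the kernel's types. */
enum model_state { MODEL_TASK_RUNNING, MODEL_TASK_UNINTERRUPTIBLE };

struct model_task {
	enum model_state state;
	bool on_rq;	/* still queued on a runqueue */
	bool on_cpu;	/* currently executing on a CPU */
};

struct model_result {
	int ret;		/* 1 if handled as a "runnable" wakeup */
	bool preempt_checked;	/* would check_preempt_curr() have run? */
};

/* Mirrors only the branch structure of the patched ttwu_runnable(). */
static struct model_result model_ttwu_runnable(struct model_task *p)
{
	struct model_result res = { .ret = 0, .preempt_checked = false };

	assert(p != NULL);

	if (p->on_rq) {
		/*
		 * on_rq && !on_cpu: the task was preempted after setting its
		 * state, so this is where the preemption check would happen.
		 */
		if (!p->on_cpu)
			res.preempt_checked = true;

		/* In both queued cases the state goes back to running. */
		p->state = MODEL_TASK_RUNNING;
		res.ret = 1;
	}

	return res;
}
```

A task that is still current (on_rq && on_cpu) only gets the state update, a
preempted one (on_rq && !on_cpu) additionally gets the preemption check, and a
dequeued one returns 0 untouched, which in the real code means falling through
to the full wakeup path.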