Re: [RFC PATCH 39/86] sched: handle lazy resched in set_nr_*_polling()

From: Peter Zijlstra
Date: Wed Nov 08 2023 - 04:16:01 EST


On Tue, Nov 07, 2023 at 01:57:25PM -0800, Ankur Arora wrote:
> To trigger a reschedule on a target runqueue a few things need
> to happen first:
>
> 1. set_tsk_need_resched(target_rq->curr, RESCHED_eager)
> 2. ensure that the target CPU sees the need-resched bit
> 3. preempt_fold_need_resched()
>
> Most of this is done via some combination of: resched_curr(),
> set_nr_if_polling(), and set_nr_and_not_polling().
>
> Update the last two to also handle TIF_NEED_RESCHED_LAZY.
>
> One thing to note is that TIF_NEED_RESCHED_LAZY has run to completion
> semantics, so unlike TIF_NEED_RESCHED, we don't need to ensure that
> the caller sees it, and of course there is no preempt folding.
>
> Originally-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Signed-off-by: Ankur Arora <ankur.a.arora@xxxxxxxxxx>
> ---
> kernel/sched/core.c | 17 +++++++++--------
> 1 file changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index e2215c417323..01df5ac2982c 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -892,14 +892,15 @@ static inline void hrtick_rq_init(struct rq *rq)
>
> #if defined(CONFIG_SMP) && defined(TIF_POLLING_NRFLAG)
> /*
> - * Atomically set TIF_NEED_RESCHED and test for TIF_POLLING_NRFLAG,
> + * Atomically set TIF_NEED_RESCHED[_LAZY] and test for TIF_POLLING_NRFLAG,
> * this avoids any races wrt polling state changes and thereby avoids
> * spurious IPIs.
> */
> -static inline bool set_nr_and_not_polling(struct task_struct *p)
> +static inline bool set_nr_and_not_polling(struct task_struct *p, resched_t rs)
> {
> struct thread_info *ti = task_thread_info(p);
> - return !(fetch_or(&ti->flags, _TIF_NEED_RESCHED) & _TIF_POLLING_NRFLAG);
> +
> + return !(fetch_or(&ti->flags, _tif_resched(rs)) & _TIF_POLLING_NRFLAG);
> }

Argh, this is making the whole thing even worse, because now you're
using that eager naming for setting, which has the exact opposite meaning
from testing.
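
For reference, a minimal userspace sketch of the fetch_or() pattern in
the hunk above, using C11 atomics rather than the kernel helpers. The
SKETCH_* bit values and the function name are made up for illustration
(the real flag layout is per-architecture), and the resched bit is
passed in directly instead of going through resched_t. It only models
the atomic set-and-test; it says nothing about how the naming should
look.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define SKETCH_TIF_NEED_RESCHED       (1UL << 0)
#define SKETCH_TIF_NEED_RESCHED_LAZY  (1UL << 1)
#define SKETCH_TIF_POLLING_NRFLAG     (1UL << 2)

static_assert((SKETCH_TIF_NEED_RESCHED & SKETCH_TIF_POLLING_NRFLAG) == 0,
	      "resched and polling bits must be distinct");

/*
 * Set the requested resched bit and, in the same atomic operation,
 * observe whether the target was polling. Returns true when the target
 * was NOT polling, i.e. when the caller would still have to send an IPI.
 */
static bool sketch_set_nr_and_not_polling(atomic_ulong *flags,
					  unsigned long resched_bit)
{
	/* atomic_fetch_or() returns the flags as they were before the OR. */
	unsigned long old = atomic_fetch_or(flags, resched_bit);

	return !(old & SKETCH_TIF_POLLING_NRFLAG);
}
```

Because the old value comes back from the same atomic operation that
sets the bit, there is no window in which the polling state can change
between the set and the test.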

> @@ -916,7 +917,7 @@ static bool set_nr_if_polling(struct task_struct *p)
> for (;;) {
> if (!(val & _TIF_POLLING_NRFLAG))
> return false;
> - if (val & _TIF_NEED_RESCHED)
> + if (val & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY))
> return true;
> if (try_cmpxchg(&ti->flags, &val, val | _TIF_NEED_RESCHED))
> break;

Depending on the exact semantics of LAZY this could be wrong; the
Changelog doesn't clarify.
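
To make the question concrete, here is a userspace model of the loop as
changed by the hunk above, written with C11 atomics. The MODEL_* bit
values and the function name are hypothetical, and the final
"return true" after the loop is assumed from the unpatched function
since the hunk does not show it. Treat it as a sketch of one reading,
not as the kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit positions, chosen only for this model. */
#define MODEL_TIF_NEED_RESCHED       (1UL << 0)
#define MODEL_TIF_NEED_RESCHED_LAZY  (1UL << 1)
#define MODEL_TIF_POLLING_NRFLAG     (1UL << 2)

static_assert((MODEL_TIF_NEED_RESCHED & MODEL_TIF_NEED_RESCHED_LAZY) == 0,
	      "eager and lazy bits must be distinct");

/* Models the patched loop: true means "no IPI needed". */
static bool model_set_nr_if_polling(atomic_ulong *flags)
{
	unsigned long val = atomic_load(flags);

	for (;;) {
		if (!(val & MODEL_TIF_POLLING_NRFLAG))
			return false;
		/* Patched test: either resched bit short-circuits here. */
		if (val & (MODEL_TIF_NEED_RESCHED | MODEL_TIF_NEED_RESCHED_LAZY))
			return true;
		/* On failure val is reloaded with the current flags. */
		if (atomic_compare_exchange_weak(flags, &val,
						 val | MODEL_TIF_NEED_RESCHED))
			break;
	}
	return true;	/* assumed tail, not part of the quoted hunk */
}
```

In this model, a polling target that has only the LAZY bit set makes the
function return true while NEED_RESCHED stays clear. Whether that is
acceptable is exactly what depends on the LAZY semantics.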

Changing this in a different patch from resched_curr() makes it
impossible to review :/