Re: [PATCH 3/4] x86: Remove __current_clr_polling() from mwait_idle()

From: Frederic Weisbecker
Date: Thu Nov 16 2023 - 13:48:59 EST


On Thu, Nov 16, 2023 at 04:13:16PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 15, 2023 at 10:13:24AM -0500, Frederic Weisbecker wrote:
> > mwait_idle() is only ever called through cpuidle, either from
> > default_idle_call() or from cpuidle_enter(). In either case
> > cpuidle_idle_call() sets TIF_NR_POLLING again after calling it, so there
> > is no point in this atomic operation upon idle exit.
> >
> > Acked-by: Rafael J. Wysocki <rafael@xxxxxxxxxx>
> > Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > ---
> > arch/x86/kernel/process.c | 1 -
> > 1 file changed, 1 deletion(-)
> >
> > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> > index b6f4e8399fca..fc7a38084606 100644
> > --- a/arch/x86/kernel/process.c
> > +++ b/arch/x86/kernel/process.c
> > @@ -930,7 +930,6 @@ static __cpuidle void mwait_idle(void)
> >  			raw_local_irq_disable();
> >  		}
> >  	}
> > -	__current_clr_polling();
> >  }
> >
> > void select_idle_routine(const struct cpuinfo_x86 *c)
>
>
> Urgh at this and the next one... That is, yes we can do this, but it
> makes these functions asymmetric and doesn't actually solve the
> underlying problem that all of the polling stuff is inside-out.
>
> Idle loop sets polling, then clears polling because it assumes all
> arch/driver idle loops are non-polling, then individual drivers re-set
> polling, and to be symmetric (above) clear it again, for the generic
> code to set it again, only to clear it again when leaving idle.
>
> Follow that? ;-)

That's right :-)
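
To spell out the sequence quoted above, here is a toy userspace model of it.
This is only a sketch, not kernel code: every toy_* name is hypothetical, the
bool stands in for TIF_NR_POLLING, and the "declared" variant is just one
guess at what "drivers tell up-front" could look like, not the actual
proposal.

```c
/* Toy userspace model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct toy_cpu {
	bool polling;     /* stands in for TIF_NR_POLLING */
	int flag_writes;  /* number of times the flag was written */
};

static void toy_set_polling(struct toy_cpu *c)
{
	c->polling = true;
	c->flag_writes++;
}

static void toy_clear_polling(struct toy_cpu *c)
{
	c->polling = false;
	c->flag_writes++;
}

/* The sequence as described in the thread, for an mwait-style driver. */
static int toy_idle_current(struct toy_cpu *c)
{
	c->flag_writes = 0;
	toy_set_polling(c);    /* idle loop sets polling */
	toy_clear_polling(c);  /* generic code assumes a non-polling driver */
	toy_set_polling(c);    /* driver re-sets polling */
	/* ... the polling idle state would be entered here ... */
	toy_clear_polling(c);  /* driver clears it again for symmetry */
	toy_set_polling(c);    /* generic code sets it again */
	toy_clear_polling(c);  /* cleared when leaving idle */
	return c->flag_writes;
}

/* Assumed alternative: the driver declares up front whether it polls. */
static int toy_idle_declared(struct toy_cpu *c, bool driver_polls)
{
	c->flag_writes = 0;
	toy_set_polling(c);            /* idle loop sets polling */
	if (!driver_polls) {
		toy_clear_polling(c);  /* only non-polling states clear it */
		/* ... the non-polling idle state would be entered here ... */
		toy_set_polling(c);
	}
	toy_clear_polling(c);          /* cleared when leaving idle */
	return c->flag_writes;
}
```

In this model the polling driver goes from six flag writes to two, and a
non-polling driver does four.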

>
> Anyway, drivers ought to tell up-front if they're polling and then we
> can avoid the whole dance and everything is better.
>
> Something like the very crude below.

Yeah that makes perfect sense (can I use your SoB right away?)

Though I sometimes wonder why we even bother with setting TIF_NR_POLLING
for some short parts in the generic idle loop even on !mwait and
!cpuidle-state-polling states.

For example, why do we bother with setting TIF_NR_POLLING for just
the portion of the generic idle loop that looks up the cpuidle state
and stops the tick, only to clear TIF_NR_POLLING before calling wfi on ARM?

Or maybe it's a frequent pattern to have a remote wake up happening while
entering the idle loop?
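
For context on what that window buys, here is a rough userspace sketch of the
waker side, loosely modeled on my understanding of set_nr_if_polling() in the
scheduler. It is an approximation with C11 atomics: the TOY_* bits and toy_*
helpers are made up for illustration and are not the kernel's definitions.

```c
/* Userspace approximation, NOT kernel code. TOY_* and toy_* are hypothetical. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NEED_RESCHED (1u << 0)  /* stands in for TIF_NEED_RESCHED */
#define TOY_POLLING      (1u << 1)  /* stands in for TIF_NR_POLLING */

/*
 * If the target advertises that it is polling its flags, set NEED_RESCHED
 * atomically and report that no IPI is needed. Otherwise change nothing.
 */
static bool toy_set_nr_if_polling(atomic_uint *flags)
{
	unsigned int val = atomic_load(flags);

	do {
		if (!(val & TOY_POLLING))
			return false;  /* target is not polling */
		if (val & TOY_NEED_RESCHED)
			return true;   /* already flagged, nothing to do */
		/* on failure, val is refreshed with the current flags */
	} while (!atomic_compare_exchange_weak(flags, &val,
					       val | TOY_NEED_RESCHED));
	return true;
}

/* Returns true if the remote waker would have to send an IPI. */
static bool toy_wake_needs_ipi(atomic_uint *flags)
{
	if (toy_set_nr_if_polling(flags))
		return false;
	atomic_fetch_or(flags, TOY_NEED_RESCHED);
	return true;
}
```

So in this sketch a wakeup that lands while the polling bit is set avoids the
IPI. How often that actually happens during the short generic part of the
idle loop is exactly the open question.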

Thanks.