Re: [PATCH 05/10] cpuidle: Comment about timers requirements VS idle handler

From: Rafael J. Wysocki
Date: Fri Aug 11 2023 - 13:38:55 EST


On Fri, Aug 11, 2023 at 7:01 PM Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
>
> Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>

Acked-by: Rafael J. Wysocki <rafael@xxxxxxxxxx>

> ---
> kernel/sched/idle.c | 30 ++++++++++++++++++++++++++++++
> 1 file changed, 30 insertions(+)
>
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index 342f58a329f5..d52f6e3e3854 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -258,6 +258,36 @@ static void do_idle(void)
> while (!need_resched()) {
> rmb();
>
> + /*
> + * Interrupts shouldn't be re-enabled from that point on until
> + * the CPU sleeping instruction is reached. Otherwise an interrupt
> + * may fire and queue a timer that would be ignored until the CPU
> + * wakes from the sleeping instruction. And testing need_resched()
> + * doesn't tell about pending needed timer reprogram.
> + *
> + * Several cases to consider:
> + *
> + * - SLEEP-UNTIL-PENDING-INTERRUPT based instructions such as
> + * "wfi" or "mwait" are fine because they can be entered with
> + * interrupt disabled.
> + *
> + * - sti;mwait() couple is fine because the interrupts are
> + * re-enabled only upon the execution of mwait, leaving no gap
> + * in-between.
> + *
> + * - ROLLBACK based idle handlers with the sleeping instruction
> + * called with interrupts enabled are NOT fine. In this scheme
> + * when the interrupt detects it has interrupted an idle handler,
> + * it rolls back to its beginning which performs the
> + * need_resched() check before re-executing the sleeping
> + * instruction. This can leak a pending needed timer reprogram.
> + * If such a scheme is really mandatory due to the lack of an
> + * appropriate CPU sleeping instruction, then a FAST-FORWARD
> + * must instead be applied: when the interrupt detects it has
> + * interrupted an idle handler, it must resume to the end of
> + * this idle handler so that the generic idle loop is iterated
> + * again to reprogram the tick.
> + */
> local_irq_disable();
>
> if (cpu_is_offline(cpu)) {
> --
> 2.34.1
>