Re: [tip: timers/core] timers/nohz: Switch to ONESHOT_STOPPED in the low-res handler when the tick is stopped

From: Frederic Weisbecker
Date: Wed May 03 2023 - 12:36:35 EST


On Mon, Apr 25, 2022 at 12:50:56PM -0000, tip-bot2 for Nicholas Piggin wrote:
> The following commit has been merged into the timers/core branch of tip:
>
> Commit-ID: 62c1256d544747b38e77ca9b5bfe3a26f9592576
> Gitweb: https://git.kernel.org/tip/62c1256d544747b38e77ca9b5bfe3a26f9592576
> Author: Nicholas Piggin <npiggin@xxxxxxxxx>
> AuthorDate: Sat, 23 Apr 2022 00:14:46 +10:00
> Committer: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> CommitterDate: Mon, 25 Apr 2022 14:45:22 +02:00
>
> timers/nohz: Switch to ONESHOT_STOPPED in the low-res handler when the tick is stopped
>
> When tick_nohz_stop_tick() stops the tick and high resolution timers are
> disabled, then the clock event device is not put into ONESHOT_STOPPED
> mode. This can lead to spurious timer interrupts with some clock event
> device drivers that don't shut down entirely after firing.
>
> Eliminate these by putting the device into ONESHOT_STOPPED mode at points
> where it is not being reprogrammed. When there are no timers active, then
> tick_program_event() with KTIME_MAX can be used to stop the device. When
> there is a timer active, the device can be stopped at the next tick (any
> new timer added by timers will reprogram the tick).

I'm confused by the above, why are we handling the timer active part here?

> Signed-off-by: Nicholas Piggin <npiggin@xxxxxxxxx>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Link: https://lore.kernel.org/r/20220422141446.915024-1-npiggin@xxxxxxxxx
> ---
> kernel/time/tick-sched.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 2d76c91..b1b105d 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -928,6 +928,8 @@ static void tick_nohz_stop_tick(struct tick_sched *ts, int cpu)
> if (unlikely(expires == KTIME_MAX)) {
> if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
> hrtimer_cancel(&ts->sched_timer);
> + else
> + tick_program_event(KTIME_MAX, 1);
> return;
> }
>
> @@ -1364,9 +1366,15 @@ static void tick_nohz_handler(struct clock_event_device *dev)
> tick_sched_do_timer(ts, now);
> tick_sched_handle(ts, regs);
>
> - /* No need to reprogram if we are running tickless */
> - if (unlikely(ts->tick_stopped))
> + if (unlikely(ts->tick_stopped)) {
> + /*
> + * The clockevent device is not reprogrammed, so change the
> + * clock event device to ONESHOT_STOPPED to avoid spurious
> + * interrupts on devices which might not be truly one shot.
> + */
> + tick_program_event(KTIME_MAX, 1);

More specifically why are we stopping the tick here entirely and
unconditionally? If the tick is stopped (actually meaning it is delayed
or _might_ be totally stopped), then the next tick is going to be re-evaluated
shortly after:

* On the idle loop if within idle
* On IRQ exit if nohz_full

And then tick_nohz_stop_tick() will be called and stop the tick entirely
if necessary.

Am I missing something else?

Thanks.