Re: [PATCH v10a] timers: Move marking timer bases idle into tick_nohz_stop_tick()

From: Frederic Weisbecker
Date: Tue Feb 20 2024 - 07:34:50 EST


On Tue, Feb 20, 2024 at 01:02:18PM +0100, Anna-Maria Behnsen wrote:
> Frederic Weisbecker <frederic@xxxxxxxxxx> writes:
>
> > On Tue, Feb 20, 2024 at 11:48:19AM +0100, Anna-Maria Behnsen wrote:
> >> Frederic Weisbecker <frederic@xxxxxxxxxx> writes:
> >> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> >> index 01fb50c1b17e..b93f0e6f273f 100644
> >> --- a/kernel/time/tick-sched.c
> >> +++ b/kernel/time/tick-sched.c
> >> @@ -895,21 +895,6 @@ static void tick_nohz_stop_tick(struct tick_sched *ts, int cpu)
> >>  	/* Make sure we won't be trying to stop it twice in a row. */
> >>  	ts->timer_expires_base = 0;
> >>
> >> -	/*
> >> -	 * If this CPU is the one which updates jiffies, then give up
> >> -	 * the assignment and let it be taken by the CPU which runs
> >> -	 * the tick timer next, which might be this CPU as well. If we
> >> -	 * don't drop this here, the jiffies might be stale and
> >> -	 * do_timer() never gets invoked. Keep track of the fact that it
> >> -	 * was the one which had the do_timer() duty last.
> >> -	 */
> >> -	if (cpu == tick_do_timer_cpu) {
> >> -		tick_do_timer_cpu = TICK_DO_TIMER_NONE;
> >> -		ts->do_timer_last = 1;
> >> -	} else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) {
> >> -		ts->do_timer_last = 0;
> >> -	}
> >> -
> >>  	/* Skip reprogram of event if it's not changed */
> >>  	if (ts->tick_stopped && (expires == ts->next_tick)) {
> >>  		/* Sanity check: make sure clockevent is actually programmed */
> >
> > That should work but then you lose the optimization that resets
> > ts->do_timer_last even if the next timer hasn't changed.
> >
>
> Besides this optimization, I see another problem, but I'm not sure
> whether I understood it correctly: when the CPU drops the
> tick_do_timer_cpu assignment and stops the tick, is it possible that
> this CPU nevertheless executes tick_sched_do_timer() and then reassigns
> itself to tick_do_timer_cpu?

Yes, but in this case a timer interrupt has executed and ts->next_tick
has been cleared, so the skip-reprogram branch above is not taken.
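
To make the sequence concrete, here is a rough userspace model of it. This
is only a sketch: struct tick_sched_model, model_stop_tick(),
model_timer_interrupt() and replay_scenario() are made-up names for
illustration, not kernel code, and the model assumes that the interrupt
path clears next_tick whenever the tick is stopped. Everything else about
the real tick code is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define TICK_DO_TIMER_NONE (-1)

/* Hypothetical, heavily reduced stand-in for struct tick_sched. */
struct tick_sched_model {
	bool tick_stopped;
	bool do_timer_last;
	unsigned long long next_tick;
};

/* Models the global that names the CPU with the do_timer() duty. */
static int tick_do_timer_cpu = TICK_DO_TIMER_NONE;

/*
 * Models the ordering proposed in the quoted diff: the skip-reprogram
 * check comes first, the duty is dropped only afterwards.
 * Returns true when the skip branch was taken.
 */
bool model_stop_tick(struct tick_sched_model *ts, int cpu,
		     unsigned long long expires)
{
	if (ts->tick_stopped && expires == ts->next_tick)
		return true;

	ts->tick_stopped = true;

	if (cpu == tick_do_timer_cpu) {
		tick_do_timer_cpu = TICK_DO_TIMER_NONE;
		ts->do_timer_last = true;
	} else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) {
		ts->do_timer_last = false;
	}

	ts->next_tick = expires;
	return false;
}

/*
 * Models a timer interrupt: the CPU may pick up the duty again, and
 * (assumption of this model) the cached deadline is cleared while the
 * tick is stopped.
 */
void model_timer_interrupt(struct tick_sched_model *ts, int cpu)
{
	if (tick_do_timer_cpu == TICK_DO_TIMER_NONE)
		tick_do_timer_cpu = cpu;
	if (ts->tick_stopped)
		ts->next_tick = 0;
}

/* Replays the sequence discussed here; returns the final duty owner. */
int replay_scenario(void)
{
	struct tick_sched_model ts = { 0 };
	const int cpu = 0;
	const unsigned long long expires = 1000;
	bool skipped;

	tick_do_timer_cpu = cpu;

	/* 1. The CPU stops its tick and drops the duty. */
	skipped = model_stop_tick(&ts, cpu, expires);
	assert(!skipped && tick_do_timer_cpu == TICK_DO_TIMER_NONE);

	/* 2. A timer interrupt fires and the CPU takes the duty back. */
	model_timer_interrupt(&ts, cpu);
	assert(tick_do_timer_cpu == cpu && ts.next_tick == 0);

	/* 3. Same deadline again: next_tick was cleared, so no skip. */
	skipped = model_stop_tick(&ts, cpu, expires);
	assert(!skipped);

	return tick_do_timer_cpu;
}
```

In this model replay_scenario() ends with the duty dropped again, because
step 2 cleared next_tick and step 3 therefore cannot hit the skip branch.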

Thanks.

>
> Then it is mandatory that we also drop the assignment in the path
> where the tick is already stopped. Otherwise the problem described
> in the comment (stale jiffies) could happen, no?
>
> Thanks
>
> > Thanks.
> >
> >
> >
> >> @@ -938,6 +923,21 @@ static void tick_nohz_stop_tick(struct tick_sched *ts, int cpu)
> >>  		trace_tick_stop(1, TICK_DEP_MASK_NONE);
> >>  	}
> >>
> >> +	/*
> >> +	 * If this CPU is the one which updates jiffies, then give up
> >> +	 * the assignment and let it be taken by the CPU which runs
> >> +	 * the tick timer next, which might be this CPU as well. If we
> >> +	 * don't drop this here, the jiffies might be stale and
> >> +	 * do_timer() never gets invoked. Keep track of the fact that it
> >> +	 * was the one which had the do_timer() duty last.
> >> +	 */
> >> +	if (cpu == tick_do_timer_cpu) {
> >> +		tick_do_timer_cpu = TICK_DO_TIMER_NONE;
> >> +		ts->do_timer_last = 1;
> >> +	} else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) {
> >> +		ts->do_timer_last = 0;
> >> +	}
> >> +
> >>  	ts->next_tick = expires;
> >>
> >>  	/*