Re: [PATCH v10a] timers: Move marking timer bases idle into tick_nohz_stop_tick()

From: Anna-Maria Behnsen
Date: Tue Feb 20 2024 - 07:02:36 EST


Frederic Weisbecker <frederic@xxxxxxxxxx> writes:

> On Tue, Feb 20, 2024 at 11:48:19AM +0100, Anna-Maria Behnsen wrote:
>> Frederic Weisbecker <frederic@xxxxxxxxxx> writes:
>> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
>> index 01fb50c1b17e..b93f0e6f273f 100644
>> --- a/kernel/time/tick-sched.c
>> +++ b/kernel/time/tick-sched.c
>> @@ -895,21 +895,6 @@ static void tick_nohz_stop_tick(struct tick_sched *ts, int cpu)
>>  	/* Make sure we won't be trying to stop it twice in a row. */
>>  	ts->timer_expires_base = 0;
>>
>> -	/*
>> -	 * If this CPU is the one which updates jiffies, then give up
>> -	 * the assignment and let it be taken by the CPU which runs
>> -	 * the tick timer next, which might be this CPU as well. If we
>> -	 * don't drop this here, the jiffies might be stale and
>> -	 * do_timer() never gets invoked. Keep track of the fact that it
>> -	 * was the one which had the do_timer() duty last.
>> -	 */
>> -	if (cpu == tick_do_timer_cpu) {
>> -		tick_do_timer_cpu = TICK_DO_TIMER_NONE;
>> -		ts->do_timer_last = 1;
>> -	} else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) {
>> -		ts->do_timer_last = 0;
>> -	}
>> -
>>  	/* Skip reprogram of event if it's not changed */
>>  	if (ts->tick_stopped && (expires == ts->next_tick)) {
>>  		/* Sanity check: make sure clockevent is actually programmed */
>
> That should work but then you lose the optimization that resets
> ts->do_timer_last even if the next timer hasn't changed.
>

Apart from this optimization, I see another problem, though I'm not
sure whether I understood it correctly: when the CPU drops the
tick_do_timer_cpu assignment and stops the tick, is it possible that
this CPU nevertheless executes tick_sched_do_timer() and then reassigns
itself to tick_do_timer_cpu?

If so, it is mandatory that we also drop the assignment in the path
where the tick is already stopped. Otherwise the problem described in
the comment (stale jiffies) could happen, no?
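
To make the sequence I am asking about concrete, here is a minimal
userspace sketch, not kernel code. All toy_* names, the
drop_before_skip flag and the expiry value 100 are made up for
illustration. It assumes that tick_sched_do_timer() picks up the duty
whenever it finds TICK_DO_TIMER_NONE, and that it can run on a CPU
whose tick is already stopped, which is exactly the part I am unsure
about.

```c
#include <stdbool.h>

#define TICK_DO_TIMER_NONE (-1)

/* Toy stand-in for the global that names the CPU updating jiffies. */
static int tick_do_timer_cpu;

/* Toy subset of struct tick_sched; only the fields discussed here. */
struct toy_tick_sched {
	bool tick_stopped;
	bool do_timer_last;
	unsigned long next_tick;
};

/* The block that the patch moves: give up the do_timer() duty. */
static void toy_drop_duty(struct toy_tick_sched *ts, int cpu)
{
	if (cpu == tick_do_timer_cpu) {
		tick_do_timer_cpu = TICK_DO_TIMER_NONE;
		ts->do_timer_last = true;
	} else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) {
		ts->do_timer_last = false;
	}
}

/* Assumption: a CPU running the tick handler takes a vacant duty. */
static void toy_tick_sched_do_timer(int cpu)
{
	if (tick_do_timer_cpu == TICK_DO_TIMER_NONE)
		tick_do_timer_cpu = cpu;
}

/*
 * drop_before_skip == true:  drop happens before the "not changed" check.
 * drop_before_skip == false: drop happens after it, as in the quoted diff.
 */
static void toy_stop_tick(struct toy_tick_sched *ts, int cpu,
			  unsigned long expires, bool drop_before_skip)
{
	if (drop_before_skip)
		toy_drop_duty(ts, cpu);

	/* Skip reprogram of event if it's not changed */
	if (ts->tick_stopped && expires == ts->next_tick)
		return;

	ts->tick_stopped = true;

	if (!drop_before_skip)
		toy_drop_duty(ts, cpu);

	ts->next_tick = expires;
}

/* Returns the duty owner after: stop, retake in tick handler, stop again. */
static int toy_scenario(bool drop_before_skip)
{
	struct toy_tick_sched ts = { 0 };

	tick_do_timer_cpu = 0;
	toy_stop_tick(&ts, 0, 100, drop_before_skip);
	toy_tick_sched_do_timer(0);
	toy_stop_tick(&ts, 0, 100, drop_before_skip);

	return tick_do_timer_cpu;
}
```

Under these assumptions, toy_scenario(true) ends with
TICK_DO_TIMER_NONE, while toy_scenario(false) leaves CPU 0 holding the
duty with its tick stopped, which is the stale jiffies case.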

Thanks

> Thanks.
>
>
>
>> @@ -938,6 +923,21 @@ static void tick_nohz_stop_tick(struct tick_sched *ts, int cpu)
>>  		trace_tick_stop(1, TICK_DEP_MASK_NONE);
>>  	}
>>
>> +	/*
>> +	 * If this CPU is the one which updates jiffies, then give up
>> +	 * the assignment and let it be taken by the CPU which runs
>> +	 * the tick timer next, which might be this CPU as well. If we
>> +	 * don't drop this here, the jiffies might be stale and
>> +	 * do_timer() never gets invoked. Keep track of the fact that it
>> +	 * was the one which had the do_timer() duty last.
>> +	 */
>> +	if (cpu == tick_do_timer_cpu) {
>> +		tick_do_timer_cpu = TICK_DO_TIMER_NONE;
>> +		ts->do_timer_last = 1;
>> +	} else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) {
>> +		ts->do_timer_last = 0;
>> +	}
>> +
>>  	ts->next_tick = expires;
>>
>>  	/*