Re: [Question] timers: trigger_dyntick_cpu() vs TIMER_DEFERRABLE

From: Valentin Schneider
Date: Mon Jul 25 2022 - 11:01:05 EST


On 25/07/22 12:43, Frederic Weisbecker wrote:
> On Mon, Jul 25, 2022 at 10:32:42AM +0100, Valentin Schneider wrote:
>> From what I grok out of get_nohz_timer_target(), under
>> timers_migration_enabled we should migrate the timer to a non-idle CPU
>> (or at the very least a non-isolated CPU) *before* enqueuing the
>> timer.
>
> That's not always the case. For example TIMER_PINNED timers might have
> to run on a busy or isolated CPU.
>
> And note that even when (base->cpu == smp_processor_id()) we want to kick
> the current CPU with a self-IPI. This way we force, from IRQ-tail, the
> tick to recalculate the next deadline to fire, considering the new enqueued
> timer callback.
>

Right, tick_irq_exit() & friends... I'm still figuring out the different
dependencies down there, but I think I can roughly map the bits of what
you're describing.

>> Without timers_migration_enabled (or if TIMER_PINNED), I don't see
>> anything that could migrate the timer elsewhere, so:
>>
>> Why bother kicking a NOHZ CPU for a deferrable timer if it is the next
>> expiring one? Per the definition:
>>
>> * @TIMER_DEFERRABLE: A deferrable timer will work normally when the
>> * system is busy, but will not cause a CPU to come out of idle just
>> * to service it; instead, the timer will be serviced when the CPU
>> * eventually wakes up with a subsequent non-deferrable timer.
>>
>> I tried to find some discussion over this in LKML, but found nothing.
>> v3 of the patch did *not* kick a CPU for a deferrable timer, but v4 (the
>> one that ended up merged) did (see below). Patch in question is:
>>
>> a683f390b93f ("timers: Forward the wheel clock whenever possible")
>
> Because TIMER_DEFERRABLE timers should only be deferred when the CPU is
> in "nohz-idle". If the CPU runs an actual task with the tick shutdown
> ("nohz-full"), we should execute those deferrable timers.
>

Ah, that makes sense, thank you for highlighting the difference. The
comment *does* say "come out of *idle*", not *tickless*...
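
To check my understanding of that rule, here is a toy userspace model of
the kick decision for a freshly enqueued deferrable timer. It is only a
sketch: the enum and the function name are made up for illustration and
are not kernel symbols, and it models the theory described above rather
than what the code currently does.

```c
#include <stdbool.h>

/* Hypothetical CPU states, for illustration only (not kernel types). */
enum toy_cpu_state {
	TOY_TICKING_BUSY,	/* running a task, periodic tick active */
	TOY_TICKING_IDLE,	/* idle, but the tick has not been stopped */
	TOY_NOHZ_FULL,		/* running a task, tick stopped */
	TOY_NOHZ_IDLE,		/* idle, tick stopped */
};

/*
 * Does enqueuing a deferrable timer require kicking the target CPU?
 *
 * - Tick still running: no kick needed, the next tick will see the timer.
 * - nohz-full: the CPU is busy but tickless, so it must be kicked to
 *   reprogram its next timer event.
 * - nohz-idle: the timer is deferred until the CPU wakes up for some
 *   other reason, so no kick.
 */
static bool toy_should_kick_for_deferrable(enum toy_cpu_state state)
{
	return state == TOY_NOHZ_FULL;
}
```

So in this model only the tickless-but-busy CPU gets kicked, which
matches "come out of *idle*" in the flag's comment.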

> Now that's the theory. In practice the deferrable timers are ignored by
> both nohz-idle and nohz-full when it comes to computing the next nohz delta.
> This is a mistake that has been there since the introduction of nohz-full but I've
> always been scared to break some user setup while fixing it. Anyway things
> should look like this (untested):
>

IIUC that's making get_next_timer_interrupt() poke the deferrable base if the
CPU isn't tickless idle (IOW if it is tickless "busy" or ticking
idle). That makes sense from what you've written above, but I get your
apprehension (though AIUI "only" pinned deferrable timers should be
problematic, as the others should be migrated away).
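
Spelling out how I read it, as a toy userspace model and not the actual
kernel code: struct toy_cpu, its fields and toy_next_timer_expiry() are
all hypothetical names, and each base is reduced to a single "next
expiry" value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU view, for illustration only (not kernel types). */
struct toy_cpu {
	bool tick_stopped;	/* the periodic tick is shut down */
	bool idle;		/* no task is running */
	uint64_t next_std;	/* next expiry in the standard base */
	uint64_t next_def;	/* next expiry in the deferrable base */
};

/*
 * Next timer expiry the CPU has to care about.
 *
 * The standard base always counts. The deferrable base is only skipped
 * when the CPU is tickless *and* idle; a tickless "busy" CPU or a
 * ticking idle CPU takes it into account.
 */
static uint64_t toy_next_timer_expiry(const struct toy_cpu *cpu)
{
	uint64_t next = cpu->next_std;
	bool tickless_idle = cpu->tick_stopped && cpu->idle;

	if (!tickless_idle && cpu->next_def < next)
		next = cpu->next_def;

	return next;
}
```

With next_std = 1000 and next_def = 200, this returns 1000 for a tickless
idle CPU and 200 in every other state.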


Thanks again for your detailed reply!