Re: [PATCH] sched/core: Cache timer busy housekeeping target

From: Wanpeng Li
Date: Mon Jul 08 2019 - 22:17:34 EST


On Tue, 9 Jul 2019 at 09:39, Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
>
> On Mon, Jul 01, 2019 at 08:24:37PM +0800, Wanpeng Li wrote:
> > diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
> > index 41dfff2..0d49bef 100644
> > --- a/kernel/time/hrtimer.c
> > +++ b/kernel/time/hrtimer.c
> > @@ -195,8 +195,10 @@ struct hrtimer_cpu_base *get_target_base(struct hrtimer_cpu_base *base,
> > int pinned)
> > {
> > #if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
> > - if (static_branch_likely(&timers_migration_enabled) && !pinned)
> > - return &per_cpu(hrtimer_bases, get_nohz_timer_target());
> > + if (static_branch_likely(&timers_migration_enabled) && !pinned) {
> > + base->last_target_cpu = get_nohz_timer_target(base->last_target_cpu);
> > + return &per_cpu(hrtimer_bases, base->last_target_cpu);
>
>
> I'm not sure this is exactly what we intend to cache here.
>
> This doesn't return the last CPU for a given timer
> (that would be timer->flags & TIMER_CPUMASK) but the last CPU that
> was returned when "base" was passed.
>
> First of all, it's always initialized to CPU 0, which is perhaps
> not exactly what we want.
>
> Also the result can be very stale and awkward. If for some reason we have:
>
> base(CPU 5)->last_target_cpu = 255
>
> then later a timer is enqueued to CPU 5, the next time we re-enqueue that
> timer will be to CPU 255, then the second re-enqueue will be to whatever
> value we have in base(CPU 255)->last_target_cpu, etc...
>
> For example imagine that:
>
> base(CPU 255)->last_target_cpu = 5
>
> the timer will bounce between those two very distant CPU 5 and 255. So I think
> you rather want "timer->flags & TIMER_CPUMASK". But note that those flags
> can also be initialized to zero and therefore CPU 0, while we actually want
> the CPU of the timer enqueuer for a first use. And I can't think of a
> simple solution to solve that :-( Perhaps keeping the enqueuer CPU as the
> first choice (as we do upstream) is still the best thing we have.

Got it, thanks for pointing out this.

Wanpeng