Re: Is it ok for deferrable timer wakeup the idle cpu?

From: Viresh Kumar
Date: Mon Feb 03 2014 - 01:51:30 EST


Sorry, I was away on a short vacation.

On 28 January 2014 19:20, Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
> On Thu, Jan 23, 2014 at 07:50:40PM +0530, Viresh Kumar wrote:
>> Wait, I got the wrong code here. That wasn't my initial intention.
>> I actually wanted to write something like this:
>>
>> -	wake_up_nohz_cpu(cpu);
>> +	if (!tbase_get_deferrable(timer->base) || idle_cpu(cpu))
>> +		wake_up_nohz_cpu(cpu);
>>
>> Will that work?

Something is seriously wrong with me; I wrote rubbish code again.
Let me phrase what I wanted to write :)

"Don't send an IPI to an idle CPU for a deferrable timer."

Hopefully I have coded it correctly this time, at least.

-	wake_up_nohz_cpu(cpu);
+	if (!(tbase_get_deferrable(timer->base) && idle_cpu(cpu)))
+		wake_up_nohz_cpu(cpu);
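
To show how this differs from my earlier attempt, here is a minimal
user-space sketch, not kernel code. The two booleans stand in for
tbase_get_deferrable(timer->base) and idle_cpu(cpu), and the function
names are hypothetical, made up for illustration only.

```c
#include <stdbool.h>

/*
 * User-space model only. "deferrable" stands in for
 * tbase_get_deferrable(timer->base) and "idle" for idle_cpu(cpu).
 * Both function names are hypothetical.
 */

/* Earlier attempt: !deferrable || idle */
static bool wake_first_attempt(bool deferrable, bool idle)
{
	return !deferrable || idle;
}

/* Corrected version: !(deferrable && idle) */
static bool wake_corrected(bool deferrable, bool idle)
{
	return !(deferrable && idle);
}
```

In this model the two tests disagree whenever the timer is deferrable:
the earlier attempt wakes the idle CPU and skips the busy one, while the
corrected version skips the IPI only for a deferrable timer on an idle
CPU.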

> Well, this is going to wake up the target from its idle state, which is
> what we want to avoid if the timer is deferrable, right?

Yeah, sorry for getting it wrong a second time :(

> The simplest thing we want is:
>
> 	if (!tbase_get_deferrable(timer->base) || tick_nohz_full_cpu(cpu))
> 		wake_up_nohz_cpu(cpu);
>
> This spares the IPI for the common case where the timer is deferrable and we run
> in periodic or dynticks-idle mode (which should be 99.99% of the existing workloads).

I wasn't looking at this problem with NO_HZ_FULL in mind, as I thought it was
only about whether the CPU is idle or not. And so the solution I was
talking about was:

"Don't send an IPI to an idle CPU for a deferrable timer."

But I think that still fails with the code you wrote. For normal cases where we
don't enable NO_HZ_FULL, we will still end up waking up idle CPUs, which
is what Lei Wen reported initially.

Also, if a CPU is marked for NO_HZ_FULL and is not currently idle, then we
wouldn't send an IPI for a deferrable timer. But we actually need that, so that
we can re-evaluate the order of the timers?
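
To check each case, the two proposed conditions can be modelled as plain
predicates over the three inputs. This is only a user-space sketch under
my own assumptions: the struct, its fields and the function names are
hypothetical, and the flags merely stand in for
tbase_get_deferrable(timer->base), idle_cpu(cpu) and
tick_nohz_full_cpu(cpu).

```c
#include <stdbool.h>

/*
 * User-space model only, not kernel code. All names are hypothetical.
 */
struct target_state {
	bool deferrable;	/* timer is deferrable */
	bool idle;		/* target CPU is idle */
	bool nohz_full;		/* target CPU is a full dynticks CPU */
};

/* Idle-based test: skip the IPI only for a deferrable timer on an idle CPU. */
static bool ipi_idle_based(struct target_state s)
{
	return !(s.deferrable && s.idle);
}

/* nohz_full-based test: !deferrable || nohz_full */
static bool ipi_nohz_full_based(struct target_state s)
{
	return !s.deferrable || s.nohz_full;
}

/*
 * Build one of the eight possible states from a 3-bit index:
 * bit 2 = deferrable, bit 1 = idle, bit 0 = nohz_full.
 */
static struct target_state state_from_index(unsigned int i)
{
	struct target_state s = {
		.deferrable = (i & 4u) != 0,
		.idle = (i & 2u) != 0,
		.nohz_full = (i & 1u) != 0,
	};
	return s;
}
```

Reading the model: for a deferrable timer, the nohz_full-based test
sends the IPI only when the target is a full dynticks CPU, and the
idle-based test sends it only when the target is not idle. For a
non-deferrable timer both always send it.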

> Then we can later optimize that and spare the IPI on full dynticks CPUs when they run
> idle, but that requires some special care about subtle races which can't be dealt
> with by a simple test on "idle_cpu(target)". And power consumption in full dynticks
> is already far from optimal anyway.
>
> So I suggest we start simple with the above test, and a big fat comment which explains
> what we are doing and what needs to be done in the future.