Re: [PATCH v2] sched: fix clear NOHZ_BALANCE_KICK

From: Peter Zijlstra
Date: Wed Jun 05 2013 - 06:44:30 EST


On Wed, Jun 05, 2013 at 10:13:11AM +0200, Vincent Guittot wrote:
> I have faced a sequence where the Idle Load Balance (ILB) was sometimes
> not triggered for a while on my platform.
>
> CPU 0 and CPU 1 are running tasks and CPU 2 is idle
>
> CPU 1 kicks the Idle Load Balance
> CPU 1 selects CPU 2 as the new Idle Load Balancer
> CPU 1 sets NOHZ_BALANCE_KICK for CPU 2
> CPU 1 sends a reschedule IPI to CPU 2
> While CPU 2 wakes up, CPU 0 or CPU 1 migrates a waking task A onto CPU 2
> CPU 2 finally wakes up, runs task A and discards the Idle Load Balance
> task A quickly goes back to sleep (before a tick occurs on CPU 2)
> CPU 2 goes back to idle with NOHZ_BALANCE_KICK set
>
> From then on, whenever CPU 2 is selected as the ILB, no reschedule IPI is
> sent because NOHZ_BALANCE_KICK is already set, so no Idle Load Balance is
> performed.
>
> We must then wait for some other part of the kernel to raise the sched
> softirq on CPU 2 before NOHZ_BALANCE_KICK is finally cleared.
>
> The proposed solution clears NOHZ_BALANCE_KICK in schedule_ipi if
> we can't raise the sched_softirq for the Idle Load Balance.
>
> Changes since V1:
> - move the clearing of NOHZ_BALANCE_KICK into got_nohz_idle_kick, done
>   when the ILB can't run on this CPU (as suggested by Peter)
>
> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>

Thanks!