Re: 2.6.25-rc5-git6: Reported regressions from 2.6.24

From: Thomas Gleixner
Date: Sat Mar 22 2008 - 10:42:21 EST


On Sat, 22 Mar 2008, Andi Kleen wrote:
> > CPU0 runs the watchdog timer and schedules it on CPU1.
> >
> > With NO_HZ enabled CPU1 is in a long idle sleep. At this point of the
> > boot process there is probably no timer pending on CPU1, which means
> > the idle sleep is infinite.
> >
> > Now some time later CPU1 gets woken by an interrupt/IPI and runs the
> > timer wheel. At this point the pm_timer which is the reference clock
> > has already wrapped around, so the watchdog thinks that there is a
>
> In my old original own noidletick code I simply limited all sleeps
> to below the wrap around of the primary timer. Wouldn't something
> like that work?

No, it does not solve the real problem: the timer wheel on the idle
CPU is not reevaluated when a timer gets added from some other CPU. We
would paper over the watchdog issue, but postponing a timer event
which was added cross-CPU to some artificial expiry time is simply
wrong.
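
For illustration only, here is a minimal user-space sketch of the
wrap-around that capping the sleep would hide. It assumes a 24-bit
free-running reference counter (a common width for the ACPI PM timer,
but an assumption here); pm_delta() and observed_delta() are
hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the reference counter is 24 bits wide. */
#define PM_MASK 0xFFFFFFu

/* Ticks between two raw reads, computed modulo the counter width. */
static uint32_t pm_delta(uint32_t then, uint32_t now)
{
    return (now - then) & PM_MASK;
}

/*
 * Delta a reader computes after real_ticks have truly elapsed since
 * the raw read 'start'. Whole wraps of the counter are invisible.
 */
static uint32_t observed_delta(uint32_t start, uint64_t real_ticks)
{
    assert(start <= PM_MASK);
    uint32_t now = (uint32_t)((start + real_ticks) & PM_MASK);
    return pm_delta(start, now);
}
```

A sleep of 1000 ticks and a sleep of 2^24 + 1000 ticks give the same
observed delta, so a check that runs only after the counter has
wrapped sees far less time than really passed. Capping the sleep
avoids the wrap, but the cross-CPU timer still fires at the cap
instead of at its own expiry time.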

> I'm not sure just doing this for add_timer_on() only is correct.
> After all it could affect any other code not run by add_timer_on()
> couldn't it?

No, it's limited to add_timer_on() simply because no other code can
add a new timer (timer_list or hrtimer) which modifies the next event
on another CPU. There is also the rare case where one CPU runs the
timer callback while another one modifies the timer, but that's not
relevant for the NOHZ problem because the CPU which runs the callback
is not idle at this point.

All other timer operations are CPU local and reevaluated before the
CPU goes idle again.
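
A toy user-space model of that distinction, as a sketch only: all
names (struct toy_cpu, toy_add_timer_on() and so on) are hypothetical
and this is not the kernel's implementation. The 'kick' flag stands in
for whatever mechanism makes the idle CPU reevaluate its sleep length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NO_EVENT UINT64_MAX

struct toy_cpu {
    bool idle;
    uint64_t next_event;        /* earliest pending timer expiry */
    uint64_t programmed_wakeup; /* wakeup chosen when entering idle */
};

/* Going idle evaluates the pending timers and programs the wakeup. */
static void toy_enter_idle(struct toy_cpu *c)
{
    c->idle = true;
    c->programmed_wakeup = c->next_event;
}

/* CPU-local add: the CPU is running, so idle entry will see the timer. */
static void toy_add_timer_local(struct toy_cpu *c, uint64_t expires)
{
    assert(!c->idle);
    if (expires < c->next_event)
        c->next_event = expires;
}

/* Cross-CPU add: the target may already be asleep. */
static void toy_add_timer_on(struct toy_cpu *target, uint64_t expires,
                             bool kick)
{
    if (expires < target->next_event)
        target->next_event = expires;
    /* Models waking the idle CPU so that it reprograms its wakeup. */
    if (kick && target->idle)
        target->programmed_wakeup = target->next_event;
}

/* True if the CPU would sleep past its earliest pending timer. */
static bool toy_wakeup_is_stale(const struct toy_cpu *c)
{
    return c->idle && c->programmed_wakeup > c->next_event;
}
```

In this model a local add followed by toy_enter_idle() never leaves a
stale wakeup, while toy_add_timer_on() without the kick does.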

Thanks,

tglx