Re: [PATCH] nohz: add missing handling of clocksource watchdog

From: Thomas Gleixner
Date: Sun Dec 07 2008 - 21:15:00 EST


On Mon, 8 Dec 2008, Bartlomiej Zolnierkiewicz wrote:

> On Monday 08 December 2008, Sergei Shtylyov wrote:
> > Hello.
> >
> > R. J. Wysocki wrote:
> > > On Monday, 8 of December 2008, Bartlomiej Zolnierkiewicz wrote:
> > >
> > >> Fixes "Clocksource tsc unstable (delta = -974982308 ns)" problem.
> > >>
> > >
> > > Where can I find the description of the problem?
> > >
> >
> > Kernel.org bug #10216 (it's not about this issue though).
>
> Looking back at bug #10216 the patch is unlikely to help Lars' system
> since it doesn't use nohz and there are also warnings about unsynced
> TSCs (I just don't know why not for 2.6.27-rc kernels)...
>
> > > Rafael
> > >
> > >
> > >> [ IDE was unlucky to be initialized at the same time that
> > >> clocksource watchdog triggers and was blamed for the issue. ]
> > >>
> >
> > I think it might well have been blamed correctly -- the clocksource
> > watchdog timer gets run every half second.
>
> AFAICS this is a special one for measuring stability of the clocksource
> only -- by comparing the value returned by a given clocksource with the
> reference clocksource. Normal drivers have no bussiness there...
>
> Anyway I forgot to add that it fixes the issue for me under QEMU (on my
> laptop TSC is unstable due to halts in idle) and it could be as well QEMU's
> oddity (although it looks like a legit kernel problem). v2 of the patch
> below (updated patch description).
>
> From: Bartlomiej Zolnierkiewicz <bzolnier@xxxxxxxxx>
> Subject: [PATCH v2] nohz: add missing handling of clocksource watchdog
>
> Fixes "Clocksource tsc unstable (delta = -974982308 ns)" problem
> under QEMU.
>
> Cc: Sergei Shtylyov <sshtylyov@xxxxxxxxxxxxx>
> Cc: Lars Winterfeld <lars.winterfeld@xxxxxxxxxxxxx>
> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@xxxxxxxxx>
> ---
> kernel/time/tick-sched.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> Index: b/kernel/time/tick-sched.c
> ===================================================================
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -21,6 +21,7 @@
> #include <linux/sched.h>
> #include <linux/tick.h>
> #include <linux/module.h>
> +#include <linux/clocksource.h>
>
> #include <asm/irq_regs.h>
>
> @@ -153,6 +154,7 @@ void tick_nohz_update_jiffies(void)
> local_irq_restore(flags);
>
> touch_softlockup_watchdog();
> + clocksource_touch_watchdog();

NAK.

If this happens then the watchdog logic did not manage to schedule the
watchdog timer in time. This patch just papers over the real problem.

We do not fix QEMU problems in the kernel, as we would miss real
hardware wreckage that way.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/