Re: [PATCH V2] kernel/watchdog: fix spurious hard lockups

From: Don Zickus
Date: Mon Jun 26 2017 - 16:20:03 EST


On Fri, Jun 23, 2017 at 11:50:25PM +0200, Thomas Gleixner wrote:
> On Fri, 23 Jun 2017, Don Zickus wrote:
> > Hmm, all this work for a temp fix. Kan, how much longer until the real fix
> > of having perf count the right cycles?
>
> Quite a while. The approach is wilfully breaking the user space ABI, which
> is not going to happen.
>
> And there is a simpler solution as well, as I said here:
>
> http://lkml.kernel.org/r/alpine.DEB.2.20.1706221730520.1885@nanos

Hi Thomas,

So, you are saying instead of slowing down the perf counter, speed up the
hrtimer to sample more frequently like so:

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 03e0b69..8ff49de 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -160,7 +160,7 @@ static void set_sample_period(void)
 	 * and hard thresholds) to increment before the
 	 * hardlockup detector generates a warning
 	 */
-	sample_period = get_softlockup_thresh() * ((u64)NSEC_PER_SEC / 5);
+	sample_period = get_softlockup_thresh() * ((u64)NSEC_PER_SEC / 10);
 }

 /* Commands for resetting the watchdog */


That is another way of doing it. It just hits all the arches. It does seem
cleaner, as the watchdog_thresh value still retains its correct meaning. Are
the laptop folks going to yell at me some more for waking their systems up
more often? :-)
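
To put rough numbers on the wakeup cost, here is a small userspace sketch of
the arithmetic. It assumes the default watchdog_thresh of 10 seconds and that
get_softlockup_thresh() returns twice watchdog_thresh; the helper names below
are made up for illustration and are not kernel functions.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Assumption: soft lockup threshold is twice watchdog_thresh (in seconds). */
uint64_t softlockup_thresh(uint64_t watchdog_thresh)
{
	return watchdog_thresh * 2;
}

/*
 * hrtimer sample period in ns for a given divisor
 * (5 in the current code, 10 in the diff above).
 */
uint64_t sample_period_ns(uint64_t watchdog_thresh, uint64_t divisor)
{
	return softlockup_thresh(watchdog_thresh) * (NSEC_PER_SEC / divisor);
}

/*
 * Whole hrtimer samples that fit in one hard lockup window of
 * watchdog_thresh seconds (integer division, rounds down).
 */
uint64_t samples_per_hard_window(uint64_t watchdog_thresh, uint64_t divisor)
{
	return watchdog_thresh * NSEC_PER_SEC /
	       sample_period_ns(watchdog_thresh, divisor);
}
```

Under those assumptions the period drops from 4 seconds to 2 seconds, so each
CPU takes twice as many timer wakeups, and the hrtimer gets about five chances
per hard lockup window instead of two.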

Cheers,
Don