[patch] fix the softlockup watchdog to actually work

From: Ingo Molnar
Date: Tue Jul 17 2007 - 11:49:58 EST



* Jeremy Fitzhardinge <jeremy@xxxxxxxx> wrote:

> Ingo Molnar wrote:
> > Subject: softlockup: fix Xen bogosity
> > From: Ingo Molnar <mingo@xxxxxxx>
> >
> > this Xen related commit:
> >
>
> Well, not just Xen. It relates to any virtual environment: kvm,
> lguest, vmi, xen... (Not that they all implement a measure of
> unstolen time.)
>
> How about a more descriptive patch title, along the lines of
> "softlockup watchdog: fix rate limiting"?

uhm, the problem was that it did not work _at all_, not something about
'rate limiting'. Yes, i got quite a bit grumpy when i found this,
because you completely broke the softlockup watchdog via a pretty
intrusive commit and you apparently didnt even do a minimal check
whether its functionality was preserved! Updated patch for Andrew/Linus
and for -stable attached.

Ingo

----------------------------->
Subject: fix the softlockup watchdog to actually work
From: Ingo Molnar <mingo@xxxxxxx>

this Xen related commit:

commit 966812dc98e6a7fcdf759cbfa0efab77500a8868
Author: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Date: Tue May 8 00:28:02 2007 -0700

Ignore stolen time in the softlockup watchdog

broke the softlockup watchdog to never report any lockups. (!)

print_timestamp defaults to 0, this makes the following condition
always true:

if (print_timestamp < (touch_timestamp + 1) ||

and we'll in essence never report soft lockups.

apparently the functionality of the soft lockup watchdog was never
actually tested with that patch applied ...

[this is -stable material too.]

Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
---
kernel/softlockup.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

Index: linux/kernel/softlockup.c
===================================================================
--- linux.orig/kernel/softlockup.c
+++ linux/kernel/softlockup.c
@@ -79,10 +79,11 @@ void softlockup_tick(void)
print_timestamp = per_cpu(print_timestamp, this_cpu);

/* report at most once a second */
- if (print_timestamp < (touch_timestamp + 1) ||
- did_panic ||
- !per_cpu(watchdog_task, this_cpu))
+ if ((print_timestamp >= touch_timestamp &&
+ print_timestamp < (touch_timestamp + 1)) ||
+ did_panic || !per_cpu(watchdog_task, this_cpu)) {
return;
+ }

/* do not print during early bootup: */
if (unlikely(system_state != SYSTEM_RUNNING)) {
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/