Re: [PATCH] x86: Fix i386 nmi_watchdog that does not trigger die_nmi

From: Andi Kleen
Date: Mon Mar 06 2006 - 22:44:17 EST


GOTO Masanori <gotom@xxxxxxxxxx> writes:

> This patch fixes the i386 nmi_watchdog, which never meets its watchdog
> timeout condition. It does not call die_nmi when it should, because
> the current nmi_watchdog_tick in arch/i386/kernel/nmi.c increments
> alert_counter and then resets it on the same tick, like this:
>
> void nmi_watchdog_tick (struct pt_regs * regs) {
>     if (last_irq_sums[cpu] == sum) {
>         alert_counter[cpu]++;       <- count up alert_counter, but
>         if (alert_counter[cpu] == 5*nmi_hz)
>             die_nmi(regs, "NMI Watchdog detected LOCKUP");
>         alert_counter[cpu] = 0;     <- reset alert_counter
>
> This patch changes it back to the previous, working version.
> Tested with 2.6.15. It also applies to 2.6.16-rc5.
>
> This was found and originally written by Kohta NAKASHIMA.

Oops. Looks quite bad. Real 2.6.16 candidate I guess.

-Andi

>
> -- gotom
>
> Signed-Off-By: GOTO Masanori <gotom@xxxxxxxxxx>
> ---
>
> --- linux-2.6.15/arch/i386/kernel/nmi.c.gotom 2006-03-02 17:52:49.021365056 +0900
> +++ linux-2.6.15/arch/i386/kernel/nmi.c 2006-03-02 17:53:19.939664760 +0900
> @@ -544,7 +544,7 @@ void nmi_watchdog_tick (struct pt_regs *
>  			 * die_nmi will return ONLY if NOTIFY_STOP happens..
>  			 */
>  			die_nmi(regs, "NMI Watchdog detected LOCKUP");
> -
> +	} else {
>  		last_irq_sums[cpu] = sum;
>  		alert_counter[cpu] = 0;
>  	}
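
For readers who want to see the effect of the one-line change outside the
kernel, here is a minimal user-space sketch of the counter logic. It is not
the code in arch/i386/kernel/nmi.c: struct watchdog, tick_buggy, tick_fixed,
ticks_until_lockup and the NMI_HZ value of 1 are all hypothetical names and
values chosen for illustration, and a boolean return stands in for the call
to die_nmi.

```c
#include <stdbool.h>

#define NMI_HZ        1             /* assumed tick rate, illustration only */
#define LOCKUP_TICKS  (5 * NMI_HZ)  /* same threshold shape as 5*nmi_hz */

/* Hypothetical per-CPU state, standing in for last_irq_sums[cpu] and
 * alert_counter[cpu]. */
struct watchdog {
	unsigned int last_irq_sum;
	unsigned int alert_counter;
};

/* Pre-patch shape: the reset sits inside the "sum unchanged" branch, so
 * the counter is zeroed on the same tick that incremented it. */
static bool tick_buggy(struct watchdog *w, unsigned int sum)
{
	bool lockup = false;

	if (w->last_irq_sum == sum) {
		w->alert_counter++;
		if (w->alert_counter == LOCKUP_TICKS)
			lockup = true;          /* stands in for die_nmi() */
		w->last_irq_sum = sum;
		w->alert_counter = 0;           /* always undoes the increment */
	}
	return lockup;
}

/* Post-patch shape: the reset only runs when the interrupt sum moved. */
static bool tick_fixed(struct watchdog *w, unsigned int sum)
{
	if (w->last_irq_sum == sum) {
		w->alert_counter++;
		if (w->alert_counter == LOCKUP_TICKS)
			return true;            /* stands in for die_nmi() */
	} else {
		w->last_irq_sum = sum;
		w->alert_counter = 0;
	}
	return false;
}

/* Feed a constant interrupt sum (a simulated stuck CPU) and report the
 * tick on which a lockup is flagged, or -1 if it never is. */
static int ticks_until_lockup(bool (*tick)(struct watchdog *, unsigned int),
			      int max_ticks)
{
	struct watchdog w = { .last_irq_sum = 42, .alert_counter = 0 };

	for (int i = 1; i <= max_ticks; i++)
		if (tick(&w, 42))
			return i;
	return -1;
}
```

Under these assumptions, ticks_until_lockup(tick_buggy, 100) never reports a
lockup, while ticks_until_lockup(tick_fixed, 100) reports one on the fifth
tick, which matches the behaviour described in the patch description.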