Re: [REGRESSION][BISECTED][X86] next-20080526 hangs on boot

From: Sitsofe Wheeler
Date: Mon May 26 2008 - 15:37:52 EST


<posted & mailed>

Sitsofe Wheeler wrote:

> Cyrill Gorcunov wrote:
>
>> [Sitsofe Wheeler - Mon, May 26, 2008 at 03:04:54PM +0100]
>> | When using a 32 bit linux-next-20080526 the bootup process will hang at
>> | a random point (not even sysrq helps) with no additional output on the
>> | screen (whereas linux-next-20080523 did boot). Mysteriously, booting
>> | with nmi_watchdog=2 allows the boot to finish (booting with
>> | nmi_watchdog=1 still stalls). I have bisected it down to commit
>> | [d1b946b97d71423f365fa797d1428e1847c0bec1]:
>>
>> Hi, so it helps by reverting only that commit? I mean all further commits
>> are still applied?
>
> Ah that I hadn't tested. I believe I might need to revert
> 4b82b277707a39b97271439c475f186f63ec4692 too if later commits are applied
> (but I'm still testing)
>
>> and, btw, could you post your config, please?
>
> http://sucs.org/~sits/test/config-20080526.txt

OK, applying the following patch (which is more or less a revert of
[4b82b277707a39b97271439c475f186f63ec4692]) resolves the problem:

diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index d99ee8a..c55519c 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -480,8 +480,12 @@ int proc_nmi_enabled(struct ctl_table *table, int write, struct file *file,
 		return -EIO;
 	}

-	/* if nmi_watchdog is not set yet, then set it */
-	nmi_watchdog_default();
+	if (nmi_watchdog == NMI_DEFAULT) {
+		if (lapic_watchdog_ok())
+			nmi_watchdog = NMI_LOCAL_APIC;
+		else
+			nmi_watchdog = NMI_IO_APIC;
+	}

 	if (nmi_watchdog == NMI_LOCAL_APIC) {
 		if (nmi_watchdog_enabled)
diff --git a/include/asm-x86/nmi.h b/include/asm-x86/nmi.h
index 1e8f34d..7cd5b6a 100644
--- a/include/asm-x86/nmi.h
+++ b/include/asm-x86/nmi.h
@@ -38,9 +38,11 @@ static inline void unset_nmi_pm_callback(struct pm_dev *dev)

 #ifdef CONFIG_X86_64
 extern void default_do_nmi(struct pt_regs *);
+extern void nmi_watchdog_default(void);
+#else
+#define nmi_watchdog_default() do {} while (0)
 #endif

-extern void nmi_watchdog_default(void);
 extern void die_nmi(char *str, struct pt_regs *regs, int do_panic);
 extern int check_nmi_watchdog(void);
 extern int nmi_watchdog_enabled;
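
For anyone who wants to poke at the logic outside the kernel, the fallback that the first hunk puts back into proc_nmi_enabled() can be sketched as a standalone function. This is only an illustration: the enum values are placeholders rather than the kernel's real constants, resolve_nmi_watchdog() is a made-up name, and the lapic_ok parameter stands in for whatever lapic_watchdog_ok() returns.

```c
/*
 * Standalone sketch of the fallback in the first hunk.
 * Assumptions: the enum values are placeholders (not the kernel's
 * definitions) and resolve_nmi_watchdog() is a hypothetical helper.
 */
enum nmi_mode {
	NMI_NONE,
	NMI_IO_APIC,
	NMI_LOCAL_APIC,
	NMI_DEFAULT
};

/* lapic_ok stands in for the result of lapic_watchdog_ok(). */
enum nmi_mode resolve_nmi_watchdog(enum nmi_mode current, int lapic_ok)
{
	/* Only an unset watchdog is resolved to a concrete mode. */
	if (current == NMI_DEFAULT)
		return lapic_ok ? NMI_LOCAL_APIC : NMI_IO_APIC;

	/* A mode that was already chosen is left untouched. */
	return current;
}
```

Calling resolve_nmi_watchdog(NMI_DEFAULT, 1) picks the local APIC mode, resolve_nmi_watchdog(NMI_DEFAULT, 0) falls back to the IO APIC mode, and any other input is returned unchanged.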

The removal of extern void nmi_watchdog_default(void) and the inclusion
of #define nmi_watchdog_default() do {} while (0) look suspicious. Note
that do {} while (0) is not an infinite loop: its empty body runs once,
so on 32 bit systems nmi_watchdog_default() expands to a no-op.
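
For reference, here is a small userspace sketch of how a do {} while (0) macro behaves. The first macro has the same shape as the 32 bit stub in the header hunk; bump_once() and both functions are hypothetical names added only for illustration and are not part of the kernel.

```c
/* Same shape as the 32 bit stub in the header hunk. */
#define nmi_watchdog_default() do {} while (0)

/* Hypothetical helper: the body of a do/while (0) runs exactly once. */
#define bump_once(counter) do { (counter)++; } while (0)

/* Returns 1, which shows that control flows past the empty stub. */
int reached_after_stub(void)
{
	int reached = 0;

	nmi_watchdog_default();	/* empty body, condition is false: no loop */
	reached = 1;
	return reached;
}

/* Returns 1: the macro body executed once and the else was not taken. */
int count_after_bump(void)
{
	int n = 0;

	if (n == 0)
		bump_once(n);	/* the wrapper keeps the unbraced if/else valid */
	else
		n = -1;
	return n;
}
```

The wrapper is commonly used so that a macro behaves like a single statement that takes a trailing semicolon, which is why the empty variant serves as a no-op placeholder.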

--
Sitsofe | http://sucs.org/~sits/