Re: 2.6.20.4: NETDEV WATCHDOG and lockups

From: Christian Kujau
Date: Wed Apr 04 2007 - 09:12:28 EST


On Tue, 3 Apr 2007, Francois Romieu wrote:
Christian Kujau <evil@xxxxxxxxxx> :
If the apic voodoo makes no difference, you can:
1 - leave it enabled

Well, we tried to boot with ACPI compiled in again, but disabled during boot:

- acpi=off lapic, crashed after 1h (almost exactly) of service
- acpi=off lapic, crashed again, this time after 4h (almost exactly)
- acpi=off noapic, still running, now 21h.

The 2nd node has been booted with 'noapic' and ACPI *not* compiled in and is now running for 2,5 days. However, interrupts are not shared between cores.

This means we still have to test booting with 'lapic' and ACPI enabled. Unfortnately there are a few more sub-options to choose from:

- acpi=force -- enable ACPI if default was off
- acpi=noirq -- do not use ACPI for IRQ routing
- acpi=ht -- run only enough ACPI to enable Hyper Threading
- acpi=strict -- Be less tolerant of platforms that are
not strictly ACPI specification compliant.

2 - check that netconsole is not used with the 8139 (it is busted)
3 - check that netconsole is not used at all

Actually I was thinking about *using* netconsole, since even setting up remote (userspace-)syslog left nothing on the syslog-server, when the machine crashed. But if it's b0rked in 8139, I will refrain from doing so.

4 - try:
http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.21-rc5/r8169-20070402

Are they in -rc5 yet or 'not in -rc5 but should be applied to -rc5'?

Thanks for your time,
Christian.
--
BOFH excuse #235:

The new frame relay network hasn't bedded down the software loop transmitter yet.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/