Re: 2.6.20.4: NETDEV WATCHDOG and lockups

From: Christian Kujau
Date: Fri Apr 06 2007 - 14:19:59 EST


On Wed, 4 Apr 2007, Christian Kujau wrote:
> Maybe it's a real locking problem. Here are some more
> suggestions for testing (if you don't find anything better):
> - try without SMP, so: 'acpi=off lapic nosmp'

We were able to get our hosting provider to replace the 8139too with an e100; the onboard r8169 stayed, of course. After this, the box came back fine... only to lock up again shortly after :(

So again we spoke to our hosting provider and they just took out the 2 SATA disks and put them in a completely new system: amd64 dualcore again, 2 GB RAM, r8169 onboard NIC, e100 PCI-slot NIC. Now booting 2.6.20.4 and even 2.6.18-4-k7 (the Debian kernel) with IOAPIC enabled seems to work, meaning the box has been up since yesterday evening and interrupts are shared. Not equally, but still:

# cat /proc/interrupts
           CPU0       CPU1
  0:        111          0   IO-APIC-edge      timer
  1:          7          9   IO-APIC-edge      i8042
  4:        260          1   IO-APIC-edge      serial
  6:          0          3   IO-APIC-edge      floppy
  8:          0          1   IO-APIC-edge      rtc
  9:          0          0   IO-APIC-fasteoi   acpi
 16:        157     575579   IO-APIC-fasteoi   eth0
 17:    3812553          1   IO-APIC-fasteoi   eth1
 19:     100651    8262484   IO-APIC-fasteoi   libata
NMI:          0          0
LOC:   17272991   17266237
ERR:          0
MIS:          0
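
(As an aside, for anyone who wants to put numbers on "not equally": below is a minimal Python sketch that computes the per-CPU share of each IRQ from /proc/interrupts-style text. It is not something we ran on the box; the helper names parse_interrupts and cpu_share are made up for illustration, and the sample is just three rows copied from the output above plus the ERR line.)

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into (cpu_names, {irq: [counts]})."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()  # header row, e.g. ["CPU0", "CPU1"]
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        counts = []
        # The first len(cpus) fields after the label are the per-CPU counters.
        for field in rest.split()[:len(cpus)]:
            if not field.isdigit():
                break
            counts.append(int(field))
        # Rows such as ERR/MIS carry a single total, not per-CPU counts: skip.
        if len(counts) == len(cpus):
            table[label.strip()] = counts
    return cpus, table


def cpu_share(counts):
    """Return each CPU's fraction of the interrupts for one IRQ."""
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


# Subset of the output quoted above (assumption: whitespace-separated columns).
SAMPLE = """\
CPU0 CPU1
16: 157 575579 IO-APIC-fasteoi eth0
17: 3812553 1 IO-APIC-fasteoi eth1
19: 100651 8262484 IO-APIC-fasteoi libata
ERR: 0
"""

if __name__ == "__main__":
    cpus, table = parse_interrupts(SAMPLE)
    for irq, counts in table.items():
        shares = ", ".join(
            f"{cpu}={share:.1%}" for cpu, share in zip(cpus, cpu_share(counts))
        )
        print(f"IRQ {irq}: {shares}")
```

On a live system the same function could be fed open("/proc/interrupts").read() instead of SAMPLE.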

While this is a good thing, we now have different problems: our 2nd SATA drive is not usable any more, but again we doubt hardware problems, because this disk was already replaced back in the old box...

But yes, these seem to be different problems. For the curious among you, I've put details here: http://nerdbynature.de/bits/2.6.20.4/db2/

Thanks to all who have replied,
Christian.
--
BOFH excuse #209:

Only people with names beginning with 'A' are getting mail this week (a la Microsoft)