Re: apic errors and looping with 2.4, none with 2.2

From: Mikael Pettersson
Date: Thu Mar 25 2004 - 11:42:00 EST


Chris Stromsoe writes:
> I have one machine that won't run 2.4. As soon as a 2.4 kernel boots, it
> starts throwing APIC errors.
>
> The machine is a dual CPU pIII 933MHz system with 512Mb ram on a
> SuperMicro motherboard, either a P3TDLR or a 370DLR, with the ServerWorks
> LE chipset. I'm booting using lilo with append="noapic".
>
> As soon as I boot into a 2.4 kernel, I start getting APIC errors on both
> CPUs. Varying combinations of:
>
> Mar 23 00:40:45 dahlia kernel: APIC error on CPU0: 02(08)
> Mar 23 00:40:45 dahlia kernel: APIC error on CPU1: 01(08)
...
> After a few hours of uptime, the box stops responding to keyboard input.
> It begins printing the above messages to console over and over. I have
> several other identical machines that I received in the same batch that
> run 2.4 without any problems (though they do seem to require "noapic").
>
> It runs fine with 2.2 and is running 2.2.26 right now.

Like Maciej wrote, the machine's local APIC bus corrupts messages.
The fact that your other supposedly-identical machines don't have
this problem indicates strongly that this particular box has a HW
problem. It could be a weak power-supply, inadequate cooling, a
bad CPU, or a bad motherboard.

I just checked the 2.2.25 kernel and it doesn't seem to enable
APIC_LVTERR. You're problably getting APIC bus errors with 2.2
too, you just won't see them logged anywhere.

(Regarding your other boxes that need "noapic": you have tried
w/o ACPI, right? acpi=off or pci=noacpi.)

/Mikael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/