Re: NMI problems with Dell SMP Xeons

From: Keith Owens
Date: Wed Jun 07 2006 - 00:53:09 EST


Following a suggestion by Brendan Trotter, I ran some more tests to
track down the problem with sending NMI IPI on Dell Xeons.

BIOS Logical   OS ACPI    Cpus        IPI 2           NMI IPI
Processor                 BIOS   OS                   (APIC_DM_NMI)

Enabled        Enabled    4      4    Not delivered   Delivered as NMI
Enabled        Disabled   4      2    Machine reset   Machine reset
Disabled       Enabled    2      2    Not delivered   Delivered as NMI
Disabled       Disabled   2      2    Not delivered   Delivered as NMI

So the killer combination with this motherboard is when the BIOS knows
about logical processors but the OS does not. Sending IPI 2 or NMI IPI
with that combination kills the machine. Brendan suggested that the
BIOS is seeing the broadcast NMI on the logical processors which are
not under OS control and that the BIOS cannot cope.

Should we change the x86_64 send_IPI_allbutself() so that the IPI is
only delivered to cpus that the OS knows about, instead of doing a
general broadcast? That would prevent offline or hidden cpus being sent
an interrupt that they are not expecting. The failing case is
__send_IPI_shortcut, with a cfg of 0xc0c00.
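
For reference, here is a rough sketch of how that cfg value breaks down
and what a mask based alternative would target. It is an illustration
only, not the kernel code. The APIC_* values are the ICR bit
definitions as I understand them from apicdef.h; the ICR_*_FIELD masks,
the helper names and the 32 bit online mask are made up for the
example.

```c
#include <assert.h>
#include <stdint.h>

/* ICR low word bits (values believed to match the kernel's apicdef.h). */
#define APIC_DM_NMI          0x00400u  /* delivery mode 100b = NMI */
#define APIC_DEST_LOGICAL    0x00800u  /* logical destination mode */
#define APIC_DEST_ALLBUT     0xC0000u  /* shorthand 11b = all excluding self */

/* Field masks: hypothetical names, used only in this sketch. */
#define ICR_VECTOR_FIELD     0x000FFu  /* bits 0-7 */
#define ICR_DM_FIELD         0x00700u  /* bits 8-10 */
#define ICR_SHORTHAND_FIELD  0xC0000u  /* bits 18-19 */

/* True if cfg asks for NMI delivery. */
static inline int icr_is_nmi(uint32_t cfg)
{
	return (cfg & ICR_DM_FIELD) == APIC_DM_NMI;
}

/* True if cfg uses the "all excluding self" broadcast shorthand. */
static inline int icr_is_allbutself(uint32_t cfg)
{
	return (cfg & ICR_SHORTHAND_FIELD) == APIC_DEST_ALLBUT;
}

/*
 * Mask based alternative: target only the cpus the OS has online,
 * minus the sender.  online_mask has one bit per OS cpu number; this
 * is a simplified stand-in for the kernel's cpu mask handling.
 */
static inline uint32_t allbutself_targets(uint32_t online_mask, unsigned self)
{
	assert(self < 32);
	return online_mask & ~(UINT32_C(1) << self);
}
```

So 0xc0c00 is APIC_DEST_ALLBUT | APIC_DEST_LOGICAL | APIC_DM_NMI with a
vector of 0, i.e. the shorthand broadcast goes to every other logical
processor whether or not the OS knows about it. In the failing
configuration (4 cpus in the BIOS, 2 in the OS), and assuming the OS
cpus are numbered 0 and 1, allbutself_targets(0x3, 0) yields 0x2, so
only the one other OS cpu would be sent the NMI.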
