Re: NMI problems with Dell SMP Xeons

From: Andi Kleen
Date: Wed Jun 07 2006 - 03:20:09 EST


On Wednesday 07 June 2006 06:49, Keith Owens wrote:
> Following a suggestion by Brendan Trotter, I ran some more tests to
> track down the problem with sending NMI IPI on Dell Xeons.
>
> BIOS Logical  OS ACPI   Cpus       IPI 2          NMI IPI
> Processor               BIOS  OS                  (APIC_DM_NMI)
>
> Enabled       Enabled   4     4    Not delivered  Delivered as NMI
> Enabled       Disabled  4     2    Machine reset  Machine reset
> Disabled      Enabled   2     2    Not delivered  Delivered as NMI
> Disabled      Disabled  2     2    Not delivered  Delivered as NMI
>
> So the killer combination with this motherboard is when the BIOS knows
> about logical processors but the OS does not. Sending IPI 2 or NMI IPI
> with that combination kills the machine. Brendan suggested that the
> BIOS is seeing the broadcast NMI on the logical processors which are
> not under OS control and that the BIOS cannot cope.

How did you manage that? Normally the OS should use all CPUs
known to the BIOS. Or did you boot with special boot options to limit it?

> Should we change the x86_64 send_IPI_allbutself() so it is only
> delivered to cpus that the OS knows about, instead of doing a general
> broadcast?

Hmm, we should be doing that already to avoid races with CPU hotplug. But
maybe it's not working correctly for KDB. Does it go away when you
enable CPU hotplug? Anyway, it should be a SMOP to force it. I wouldn't
have a problem with always using sequence IPIs and getting rid of the
broadcasts. There were benchmarks at some point and there wasn't a
noticeable difference.

-Andi

