Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?

From: bill davidsen
Date: Mon Dec 08 2003 - 11:02:15 EST


In article <20031206045409.GK8039@xxxxxxxxxxxxxx>,
William Lee Irwin III <wli@xxxxxxxxxxxxxx> wrote:
| On Sat, 06.12.2003 at 05:37, William Lee Irwin III wrote:
| >> Yeah, it looks like it hit you too.
| >> Could you boot with noirqbalance on the kernel commandline and see if
| >> the problem goes away?
|
| On Sat, Dec 06, 2003 at 05:48:46AM +0100, Stian Jordet wrote:
| > Wow, that actually fixed it :)
| >        CPU0    CPU1
| >   0:  65636   63667   IO-APIC-edge   timer
| >   1:    150     136   IO-APIC-edge   i8042
| >   2:      0       0   XT-PIC         cascade
| >   3:      2       1   IO-APIC-edge   serial
| >   8:      3       1   IO-APIC-edge   rtc
| >   9:      0       0   IO-APIC-level  acpi
| >  14:     18      37   IO-APIC-edge   ide0
|
| Okay, irqbalance has gaffed (as predicted). Could you send in
| /proc/cpuinfo and /var/log/dmesg?

I think the most confusing thing about this was the choice of
"noirqbalance" as an option to mean "do balance irqs." I'm not sure that
the default of putting all irqs on a single CPU is optimal in any case,
but the naming is particularly bad.

Under light irq load the cache probably gets reloaded before the next
interrupt on modern CPUs, and under really heavy irq pressure I see
posts showing some overflow to other CPUs, so it's the in-between cases
which benefit. At least I hope people looked at cache and context-switch
effects before putting irqs on a single CPU; given the people involved,
I assume they got it right.
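
For anyone who wants to check how evenly interrupts are spread, here is
a minimal Python sketch that sums the per-CPU columns of
/proc/interrupts-style text. The names cpu_totals, cpu_shares and SAMPLE
are hypothetical, and SAMPLE is just the output quoted above with the
quote markers stripped; this is an illustration, not a tool anyone in
the thread used.

```python
# SAMPLE is the /proc/interrupts output quoted earlier in this message,
# with the "| > " quote markers removed (an assumption for illustration).
SAMPLE = """\
CPU0 CPU1
0: 65636 63667 IO-APIC-edge timer
1: 150 136 IO-APIC-edge i8042
2: 0 0 XT-PIC cascade
3: 2 1 IO-APIC-edge serial
8: 3 1 IO-APIC-edge rtc
9: 0 0 IO-APIC-level acpi
14: 18 37 IO-APIC-edge ide0
"""


def cpu_totals(text):
    """Sum the interrupt counts in each CPU column."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    totals = [0] * len(cpus)
    for ln in lines[1:]:
        _label, _, rest = ln.partition(":")
        counts = rest.split()[:len(cpus)]
        # Skip rows that do not carry one numeric count per CPU.
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue
        for i, c in enumerate(counts):
            totals[i] += int(c)
    return dict(zip(cpus, totals))


def cpu_shares(text):
    """Return each CPU's fraction of all interrupts counted."""
    totals = cpu_totals(text)
    grand = sum(totals.values())
    return {cpu: (n / grand if grand else 0.0) for cpu, n in totals.items()}


if __name__ == "__main__":
    for cpu, share in cpu_shares(SAMPLE).items():
        print(f"{cpu}: {share:.1%}")
```

On a live system the same functions could be fed the contents of
/proc/interrupts. A split near 50/50 suggests the irqs are being
spread, while one column near zero suggests they are all landing on a
single CPU.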

--
bill davidsen <davidsen@xxxxxxx>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.