Re: kernel BUG at arch/x86/kernel/io_apic_64.c:357!

From: Eric W. Biederman
Date: Tue Jul 29 2008 - 19:23:27 EST


Mike Travis <travis@xxxxxxx> writes:

> I didn't follow this from the start but one reason why NR_IRQS based on
> NR_CPUS is a bad idea is the huge (nearly 300MB) increase in memory usage
> (that's mostly wasted.) I believe there's another patch coming real soon
> now to make irq allocations dynamic. (I had also hoped to look closer at
> your irq abstraction patch you sent a while back. Does that also address
> this issue?)

The patch I sent out earlier is one of the key patches needed for killing
NR_IRQS usage in generic code, which is part of what we need to make this
dynamic.

In systems where the I/O is well balanced with the compute, typical usage
is within 16 irqs per core, and at worst 32. That is an old rule-of-thumb
observation, and it makes for reasonable allocations.

I don't have a problem at all with your code that updated the heuristic to
be based on the NR_IOAPICS.

My problem is with Thomas's patch that totally threw out all of our tuned
heuristics and made NR_IRQS=256, which is ludicrous. Even on 32bit systems
there have been cases where 1024 irq sources needed to be supported.

That is what NR_IRQ_VECTORS is. I goofed slightly in my comments.
irq_vector only needs to be NR_IRQS in size. I think ACPI still needs
NR_IRQ_VECTORS to know how many GSIs the kernel can support. The fact that
they are not mapped 1-1 right now in the 32bit kernel is unfortunate.

> But this would be a show stopper for SGI being able to ship systems if the
> distros do not want to waste this much memory and won't set NR_CPUS=4096.

Yes. We absolutely need to dynamically allocate the irq data structures. Then
we can use the irq numbers sparsely and not have problems.
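To make the idea concrete, here is a minimal userspace sketch of allocating
per-irq state on demand and looking it up by a sparse irq number. It is only
an illustration under stated assumptions, not the patch under discussion:
struct irq_desc_sketch, irq_desc_lookup(), irq_desc_get_or_alloc(), and the
bucket count are all hypothetical names and values, and real kernel code
would also need locking and a kernel allocator instead of calloc().

```c
#include <stdlib.h>

/* Hypothetical per-irq state; the real descriptor carries far more. */
struct irq_desc_sketch {
	unsigned int irq;              /* sparse irq number */
	unsigned long count;           /* example payload */
	struct irq_desc_sketch *next;  /* hash chain */
};

/* Assumed bucket count, chosen only for the example. */
#define IRQ_HASH_BUCKETS 64u

static struct irq_desc_sketch *irq_hash[IRQ_HASH_BUCKETS];

/* Return the descriptor for irq, or NULL if none has been allocated. */
struct irq_desc_sketch *irq_desc_lookup(unsigned int irq)
{
	struct irq_desc_sketch *d;

	for (d = irq_hash[irq % IRQ_HASH_BUCKETS]; d; d = d->next)
		if (d->irq == irq)
			return d;
	return NULL;
}

/* Return the descriptor for irq, allocating it on first use. */
struct irq_desc_sketch *irq_desc_get_or_alloc(unsigned int irq)
{
	struct irq_desc_sketch *d = irq_desc_lookup(irq);

	if (d)
		return d;

	d = calloc(1, sizeof(*d));
	if (!d)
		return NULL;

	d->irq = irq;
	d->next = irq_hash[irq % IRQ_HASH_BUCKETS];
	irq_hash[irq % IRQ_HASH_BUCKETS] = d;
	return d;
}
```

With something shaped like this, memory is only spent on irqs that are
actually in use, so the irq number space can be as large and as sparse as
the hardware wants without a compile-time table sized for the worst case.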

I just have problems with the code setting NR_IRQS at 256 when potentially
common hardware is talking about having that many irqs on a single device.

We really need to be able to scale to an unreasonable number of IRQs when we
have the hardware plugged into the system that will use them. Just like
we need to scale to an unreasonable number of cpus when you plug them into
a system.

I expect irqs to actually grow faster than cpus while all of the devices are
learning how to accommodate hardware virtualization. It would not surprise
me in the slightest if I can plug in the right hardware and exceed
NR_CPUS*32 irqs in an SGI machine in the next year or so.

The only problem with NR_IRQS=NR_CPUS*32 is that we pay the price on lower
end machines when we compile to support a higher cpu count.
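A rough sketch of the arithmetic behind that trade-off, for anyone who wants
to plug in numbers. The helper names are hypothetical, and the per-entry
size is a parameter the caller has to supply; nothing here claims what the
real per-irq footprint is. With the 32-per-cpu worst case, NR_CPUS=4096
gives 131072 irqs, and every statically sized per-irq table scales with
that whether or not the machine has the cpus.

```c
#include <stddef.h>

/* Rule-of-thumb values from the discussion above. */
#define IRQS_PER_CPU_TYPICAL 16u
#define IRQS_PER_CPU_WORST   32u

/* Compile-time style heuristic: irq count scales with the cpu count. */
unsigned long nr_irqs_for(unsigned long nr_cpus, unsigned long per_cpu)
{
	return nr_cpus * per_cpu;
}

/*
 * Bytes consumed by one statically sized per-irq table.
 * entry_bytes is an assumption supplied by the caller.
 */
unsigned long long static_table_bytes(unsigned long nr_irqs,
				      size_t entry_bytes)
{
	return (unsigned long long)nr_irqs * entry_bytes;
}
```

For example, static_table_bytes(nr_irqs_for(4096, IRQS_PER_CPU_WORST), n)
gives the cost of a single table with n-byte entries, which a small box
built from the same config pays in full.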

Eric
