Re: [PATCH RESEND] x86/irq: assign vectors from numa_node

From: Jesper Juhl
Date: Thu Dec 09 2010 - 18:48:13 EST


On Thu, 9 Dec 2010, Arthur Kepner wrote:

>
> (Resending with expanded cc list.)
>
> Several drivers (e.g., mlx4_core) do something similar to:
>
> err = pci_enable_msix(pdev, entries, num_possible_cpus());
>
> which takes us down this code path:
>
> pci_enable_msix
> native_setup_msi_irqs
> create_irq_nr
> __assign_irq_vector
>
> __assign_irq_vector() preferentially uses vectors from low-numbered
> CPUs. On a system with a large number (>256) of CPUs this can result in
> a CPU running out of vectors, and subsequent attempts to assign an
> interrupt to that CPU will fail.
>
> The following patch prefers vectors from the node associated with the
> device (if the device is associated with a node). This should make it
> far less likely that a single CPU's vectors will be exhausted.
>

I'm not going to pretend that I know this code *at all*, but what you
wrote made me think, and I want to share my thoughts. Perhaps they are
useful, perhaps not.

Assigning to the CPU associated with a device sounds sane and sounds like
it will distribute things more. So far so good. But I can't help wondering
if it wouldn't be sane (besides doing this) to simply fall back to the
next higher CPU if the chosen one is exhausted (or wrap around to the
first one if we are already at the highest one)... So that we'll only fail
completely if *all* CPUs are exhausted..?
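
To make the fallback idea concrete, here is a minimal userspace sketch in
C. It is not the logic of __assign_irq_vector(), and every name in it
(struct cpu_vectors, VECTORS_PER_CPU, pick_cpu_with_fallback) is made up
for illustration. It only shows the "try the preferred CPU, then the next
higher one, wrap around, fail only when all are full" ordering:

```c
#include <assert.h>

/* Hypothetical per-CPU vector budget, kept tiny for illustration. */
#define VECTORS_PER_CPU 4

/* Hypothetical bookkeeping: number of vectors already in use on a CPU. */
struct cpu_vectors {
	int used;
};

/*
 * Try 'preferred' first, then each higher-numbered CPU, wrapping around
 * to CPU 0 after the last one. Returns the CPU a vector was taken from,
 * or -1 only if every CPU is exhausted.
 */
static int pick_cpu_with_fallback(struct cpu_vectors *cpus, int ncpus,
				  int preferred)
{
	int i;

	assert(ncpus > 0 && preferred >= 0 && preferred < ncpus);

	for (i = 0; i < ncpus; i++) {
		int cpu = (preferred + i) % ncpus;

		if (cpus[cpu].used < VECTORS_PER_CPU) {
			cpus[cpu].used++;	/* claim one vector here */
			return cpu;
		}
	}
	return -1;	/* all CPUs are out of vectors */
}
```

In the patch's terms, 'preferred' would presumably be a CPU on the
device's node; the real code works with cpumasks and would need locking,
neither of which is modelled here.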


--
Jesper Juhl <jj@xxxxxxxxxxxxx> http://www.chaosbits.net/
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please.
