Re: [PATCH] irq: revert non-working patch to affinity defaults

From: Ingo Molnar
Date: Fri Apr 03 2015 - 02:56:20 EST



* Jesse Brandeburg <jesse.brandeburg@xxxxxxxxx> wrote:

> I've seen a couple of reports of issues since commit e2e64a932556 ("genirq:
> Set initial affinity in irq_set_affinity_hint()") where the affinity
> programmed for an interrupt via /proc/irq/<nnn>/smp_affinity does not
> stick: it changes back to some previous value at the next interrupt on
> that IRQ.
>
> The original intent was to fix the broken default behavior of all IRQs
> for a device starting up on CPU0. With a network card with 64 or more
> queues, all 64 queues' interrupt vectors end up on CPU0, which can have
> bad side effects, and has to be fixed by the irqbalance daemon, or by
> the user at every boot with some kind of affinity script.
>
> The symptom is that after a driver calls irq_set_affinity_hint(), the
> affinity will be set for that interrupt (and is readable via /proc/...),
> but on the first irq for that vector, the affinity for CPU0 or CPU1
> resets to the default. The rest of the irq affinities seem to work and
> everything is fine.
>
> Impact if we don't fix this for 4.0.0:
> Some users won't be able to set irq affinity as expected on some
> CPUs.
>
> I've spent a chunk of time trying to debug this with no luck, and I
> suggest that we revert the change if no one else can help me figure
> out what is going wrong; we can pick up the change later.
>
> This commit would also revert commit 4fe7ffb7e17ca ("genirq: Fix null pointer
> reference in irq_set_affinity_hint()") which was a bug fix to the original
> patch.
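
For context, the pattern Jesse describes is the usual multiqueue one: the
driver publishes an affinity hint for each queue vector. An illustrative
sketch (the 'example_adapter' structure and function are made up, not code
from any specific driver):

    #include <linux/interrupt.h>
    #include <linux/cpumask.h>
    #include <linux/pci.h>

    /* Hypothetical per-device state, just enough for the sketch. */
    struct example_adapter {
            unsigned int num_queues;
            struct msix_entry *msix_entries;
    };

    static void example_spread_queue_irqs(struct example_adapter *adap)
    {
            unsigned int i;

            for (i = 0; i < adap->num_queues; i++) {
                    unsigned int irq = adap->msix_entries[i].vector;

                    /*
                     * Publish the hint; since e2e64a932556 this also
                     * programs the initial affinity, so the vectors no
                     * longer all start out on CPU0.
                     */
                    irq_set_affinity_hint(irq,
                                          cpumask_of(i % num_online_cpus()));
            }
    }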

So the original commit also has the problem that it unnecessarily
drops and retakes the descriptor lock:

> irq_put_desc_unlock(desc, flags);
> - /* set the initial affinity to prevent every interrupt being on CPU0 */
> - if (m)
> - __irq_set_affinity(irq, m, false);


i.e. why not just call into irq_set_affinity_locked() while we still
hold the descriptor lock?
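
Something like this untested sketch is what I mean - the structure follows
irq_set_affinity_hint() in kernel/irq/manage.c, but it is only a guess at
the fix, not a tested patch:

    int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
    {
            unsigned long flags;
            struct irq_desc *desc = irq_get_desc_lock(irq, &flags,
                                                      IRQ_GET_DESC_CHECK_GLOBAL);

            if (!desc)
                    return -EINVAL;
            desc->affinity_hint = m;
            /*
             * Apply the initial affinity while we still hold desc->lock,
             * instead of dropping the lock and retaking it inside
             * __irq_set_affinity().
             */
            if (m)
                    irq_set_affinity_locked(irq_desc_get_irq_data(desc),
                                            m, false);
            irq_put_desc_unlock(desc, flags);
            return 0;
    }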

Now this is just a small annoyance that should not really matter - it
would be nice to figure out the real reason why the irqs move back
to CPU#0.

In theory the same could happen to 'irqbalance' as well, if it sets
an irq's affinity shortly after the irq was registered - so this is
not a bug we want to ignore.

Also, worst case we are back to where v3.19 was, right? So could we
try to analyze this a bit more?

Thanks,

Ingo