Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assigned to irq vector

From: Ming Lei
Date: Mon Jan 15 2018 - 20:35:05 EST


On Mon, Jan 15, 2018 at 06:43:47PM +0100, Thomas Gleixner wrote:
> On Tue, 16 Jan 2018, Ming Lei wrote:
> > These two patches fix the IO hang issue reported by Laurence.
> >
> > 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
> > may cause an irq vector to be assigned only to offline CPUs, in which
> > case this vector can't handle irqs any more.
> >
> > The 1st patch moves the irq vector spreading into one function, and
> > prepares for the fix done in the 2nd patch.
> >
> > The 2nd patch fixes the issue by trying to make sure online CPUs are
> > assigned to each irq vector.
>
> Which means it's completely undoing the intent and mechanism of managed
> interrupts. Not going to happen.

As I replied in my previous mail, after we assign vectors to all possible
CPUs, some of which are not present, some irq vectors may end up with
only offline CPUs assigned to them.
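
To illustrate, here is a minimal sketch in Python. It is not the kernel
code: the function names and the simple contiguous spreading policy are
assumptions made only for this example. It spreads all possible CPUs
over the vectors and then reports which vectors have no online CPU.

```python
def spread_vectors(num_vectors, possible_cpus):
    """Split the possible CPUs into contiguous chunks, one per vector.

    Hypothetical stand-in for spreading vectors over all possible CPUs;
    the real kernel logic is more involved (e.g. it is NUMA-aware).
    """
    per_vec = -(-len(possible_cpus) // num_vectors)  # ceiling division
    return [possible_cpus[i * per_vec:(i + 1) * per_vec]
            for i in range(num_vectors)]


def vectors_without_online_cpu(masks, online_cpus):
    """Return the indexes of vectors whose mask holds no online CPU."""
    online = set(online_cpus)
    return [vec for vec, cpus in enumerate(masks)
            if not online.intersection(cpus)]


# Assumed example topology: 8 possible CPUs, only CPUs 0-3 online.
possible = list(range(8))
online = [0, 1, 2, 3]

masks = spread_vectors(4, possible)
dead = vectors_without_online_cpu(masks, online)
print(masks)  # [[0, 1], [2, 3], [4, 5], [6, 7]]
print(dead)   # [2, 3]
```

In this example, vectors 2 and 3 are mapped only to CPUs that are not
online, so they can't handle any irq.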

>
> Which driver is that which abuses managed interrupts and does not keep its
> queues properly sorted on cpu hotplug?

It isn't related to a particular driver/device: besides HPSA, I can
easily trigger this issue on NVMe too.


Thanks,
Ming