Re: [PATCH] Revert "genirq/affinity: assign vectors to all possible CPUs"

From: Greg Kroah-Hartman
Date: Mon Oct 15 2018 - 09:21:49 EST


On Mon, Oct 15, 2018 at 02:17:11PM +0200, Paul Menzel wrote:
> Dear Greg, dear Linux folks,
>
>
> On 10/01/18 17:59, Paul Menzel wrote:
>
> > On 10/01/18 14:43, Paul Menzel wrote:
> >
> >> On 10/01/18 14:35, Christoph Hellwig wrote:
> >>> On Mon, Oct 01, 2018 at 02:33:07PM +0200, Paul Menzel wrote:
> >>>> Date: Wed, 29 Aug 2018 17:28:45 +0200
> >>>>
> >>>> This reverts commit ef86f3a72adb8a7931f67335560740a7ad696d1d.
> >>>
> >>> This seems rather odd. If at all you'd revert the patch adding the
> >>> PCI_IRQ_AFFINITY to aacraid, not core infrastructure.
> >>
> >> Thank you for the suggestion, but that flag was added in 2016
> >> to the aacraid driver.
> >>
> >>> commit 0910d8bbdd99856af1394d3d8830955abdefee4a
> >>> Author: Hannes Reinecke <hare@xxxxxxx>
> >>> Date: Tue Nov 8 08:11:30 2016 +0100
> >>>
> >>> scsi: aacraid: switch to pci_alloc_irq_vectors
> >>>
> >>> Use pci_alloc_irq_vectors and drop the hand-crafted interrupt affinity
> >>> routines.
> >>
> >> So what would happen, if `PCI_IRQ_AFFINITY` was removed? Will the
> >> system still work with the same performance?
> >>
> >> As far as I understood, the no regression policy is there for
> >> exactly that reason, and it shouldn't matter if it's core
> >> infrastructure or not. As written, I have no idea, and just know
> >> reverting the commit in question fixes the problem here. So I'll
> >> gladly test other solutions to fix this issue.
> >
> > Just as another datapoint, with `PCI_IRQ_AFFINITY` removed from
> > `drivers/scsi/aacraid/comminit.c` in Linux 4.14.73, the driver
> > initializes correctly. I have no idea regarding the performance.
>
> This commit has not been picked up yet. I guess, you are busy, but
> in case there are still objections, it'd be great if the two
> questions below were answered.
>
> 1. What bug is fixed in the LTS series by backporting the commit
> causing the regression?

I can't remember anymore, but unwinding this mess is going to be a pain :(

> 2. Why does the *no regression* policy *not* apply in this case?

It does, but also we are following the "stick to what mainline does",
and the fact that this is not showing up in mainline seems just to be a
lucky accident at the moment. My real worry is that suddenly you are
going to have problems there and that this is just the early-warning
system happening...

thanks,

greg k-h