Re: MSI irqchip configured as IRQCHIP_ONESHOT_SAFE causes spurious IRQs

From: Ramon Fried
Date: Tue Jan 14 2020 - 16:40:43 EST


On Tue, Jan 14, 2020 at 11:38 PM Ramon Fried <rfried.dev@xxxxxxxxx> wrote:
>
> On Tue, Jan 14, 2020 at 2:15 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> >
> > Ramon Fried <rfried.dev@xxxxxxxxx> writes:
> > > While debugging the root cause of spurious IRQs on my PCIe MSI line, it
> > > appears that because of the line:
> > > info->chip->flags |= IRQCHIP_ONESHOT_SAFE;
> > > in pci_msi_create_irq_domain()
> > >
> > > The IRQF_ONESHOT is ignored, especially when requesting IRQ through
> > > pci_request_threaded_irq() where handler is NULL.
> >
> > Which is perfectly fine.
> >
> > > The problem is that the MSI masking now only surrounds the HW handler,
> > > and all additional MSIs that occur before the threaded handler is
> > > complete are considered spurious by note_interrupt().
> >
> > Which is not a problem as long as the thread finishes before 100k MSIs
> > arrive on that line. If that happens then there is something really
> > wrong. Either the device fires MSIs like crazy or the threaded handler
> > is stuck somewhere.
> >
> > > Besides the side effect of that, I don't really understand the logic
> > > of not masking the MSI until the threaded handler is complete,
> > > especially when there's no HW handler and only a threaded handler.
> >
> > What's wrong with having another interrupt firing while the threaded
> > handler is running? Nothing, really. It actually can be desired because
> > the threaded handler is allowed to sleep.
> What do you mean? Isn't that the purpose of IRQ masking?
> Interrupt coalescing is done to mitigate these IRQs. These HW
> interrupts just consume CPU cycles and don't do anything useful
> (they only schedule an already scheduled thread).
Additionally, in this case there isn't even a HW IRQ handler: it's
passed as NULL to the IRQ request function.
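
To make the cost concrete, here is a minimal userspace sketch of the two
behaviours being compared. It is a toy model under stated assumptions, not
kernel code: struct line_model, model_msi(), model_thread_done() and
model_burst() are hypothetical names, and the model does not reproduce
note_interrupt() or the real MSI mask/pending-bit semantics. It only counts
how often the hard-IRQ path runs while the thread is still busy.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one interrupt line; every name here is hypothetical. */
struct line_model {
	bool oneshot;              /* true: stay masked until the thread is done */
	bool masked;               /* line is currently masked */
	bool thread_pending;       /* thread was woken and has not finished yet */
	unsigned long hard_irqs;   /* times the hard-IRQ path ran */
	unsigned long extra_wakes; /* hard IRQs that found the thread already pending */
	unsigned long held_masked; /* MSIs that arrived while the line was masked */
};

/* One MSI arrives from the device. */
void model_msi(struct line_model *l)
{
	assert(l);
	if (l->masked) {
		/* Oneshot behaviour: nothing reaches the CPU until unmask. */
		l->held_masked++;
		return;
	}
	l->hard_irqs++;
	if (l->thread_pending)
		l->extra_wakes++; /* wakeup of an already scheduled thread */
	l->thread_pending = true;
	if (l->oneshot)
		l->masked = true;
}

/* The threaded handler finishes one pass; a oneshot line is unmasked. */
void model_thread_done(struct line_model *l)
{
	assert(l);
	l->thread_pending = false;
	l->masked = false;
}

/* Deliver `burst` MSIs back to back, then let the thread finish once. */
struct line_model model_burst(bool oneshot, unsigned long burst)
{
	struct line_model l = { .oneshot = oneshot };

	for (unsigned long i = 0; i < burst; i++)
		model_msi(&l);
	model_thread_done(&l);
	return l;
}
```

In this model, a burst of N MSIs without oneshot masking enters the hard-IRQ
path N times, and N - 1 of those entries only re-wake a thread that is
already pending. With oneshot masking the hard-IRQ path runs once and the
remaining N - 1 MSIs are held back until the thread completes.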
> Thanks,
> Ramon.
> >
> > Thanks,
> >
> > tglx