Re: [PATCH 1/1] PCI: dwc: Use regular interrupt instead of chained

From: Radu Rendec
Date: Thu Jun 29 2023 - 16:43:10 EST


On Thu, 2023-06-29 at 14:57 -0500, Bjorn Helgaas wrote:
> On Thu, Jun 29, 2023 at 02:30:19PM -0400, Radu Rendec wrote:
> > The DesignWare PCIe host driver uses a chained interrupt to demultiplex
> > the downstream MSI interrupts. On Qualcomm SA8540P Ride, enabling both
> > pcie2a and pcie3a at the same time can create an interrupt storm where
> > the parent interrupt fires continuously, even though reading the PCIe
> > host registers doesn't identify any child MSI interrupt source. This
> > effectively locks up CPU0, which spends all the time servicing these
> > interrupts.
> >
> > This is a clear example of how bypassing the interrupt core by using
> > chained interrupts can be very dangerous if the hardware misbehaves.
> >
> > Convert the driver to use a regular interrupt for the demultiplex
> > handler. This allows the interrupt storm detector to detect the faulty
> > interrupt and disable it, allowing the system to run normally.
>
> There are many other users of irq_set_chained_handler_and_data() in
> drivers/pci/controller/.  Should they be similarly converted?  If not,
> how do we decide which need to use irq_set_chained_handler_and_data()
> and which do not?

According to Thomas Gleixner, yes. Obviously I don't want to put words
in his mouth, but I think that's the gist of what he said in a reply to
an RFC patch that I sent a few weeks ago:
https://lore.kernel.org/all/877csohcll.ffs@tglx/

> > Signed-off-by: Radu Rendec <rrendec@xxxxxxxxxx>
> > ---
> >  .../pci/controller/dwc/pcie-designware-host.c | 35 +++++++++----------
> >  1 file changed, 17 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> > index 9952057c8819c..b603796d415d7 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > @@ -83,18 +83,9 @@ irqreturn_t dw_handle_msi_irq(struct dw_pcie_rp *pp)
> >         return ret;
> >  }
> >  
> > -/* Chained MSI interrupt service routine */
> > -static void dw_chained_msi_isr(struct irq_desc *desc)
> > +static irqreturn_t dw_pcie_msi_isr(int irq, void *dev_id)
> >  {
> > -       struct irq_chip *chip = irq_desc_get_chip(desc);
> > -       struct dw_pcie_rp *pp;
> > -
> > -       chained_irq_enter(chip, desc);
> > -
> > -       pp = irq_desc_get_handler_data(desc);
> > -       dw_handle_msi_irq(pp);
> > -
> > -       chained_irq_exit(chip, desc);
> > +       return dw_handle_msi_irq(dev_id);
> >  }
> >  
> >  static void dw_pci_setup_msi_msg(struct irq_data *d, struct msi_msg *msg)
> > @@ -254,20 +245,21 @@ int dw_pcie_allocate_domains(struct dw_pcie_rp *pp)
> >         return 0;
> >  }
> >  
> > -static void dw_pcie_free_msi(struct dw_pcie_rp *pp)
> > +static void __dw_pcie_free_msi(struct dw_pcie_rp *pp, u32 num_ctrls)
> >  {
> >         u32 ctrl;
> >  
> > -       for (ctrl = 0; ctrl < MAX_MSI_CTRLS; ctrl++) {
> > +       for (ctrl = 0; ctrl < num_ctrls; ctrl++) {
> >                 if (pp->msi_irq[ctrl] > 0)
> > -                       irq_set_chained_handler_and_data(pp->msi_irq[ctrl],
> > -                                                        NULL, NULL);
> > +                       free_irq(pp->msi_irq[ctrl], pp);
> >         }
> >  
> >         irq_domain_remove(pp->msi_domain);
> >         irq_domain_remove(pp->irq_domain);
> >  }
> >  
> > +#define dw_pcie_free_msi(pp) __dw_pcie_free_msi(pp, MAX_MSI_CTRLS)
>
> What is the benefit of the dw_pcie_free_msi() macro?

It allows me to add the num_ctrls parameter to the corresponding
function (now renamed to __dw_pcie_free_msi()) without forcing all the
existing call sites to send MAX_MSI_CTRLS explicitly.

I needed that extra parameter to avoid duplicating the tear down code
on the (new) error path in dw_pcie_msi_init() - see below.

> >  static void dw_pcie_msi_init(struct dw_pcie_rp *pp)
> >  {
> >         struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
> > @@ -361,9 +353,16 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> >                 return ret;
> >  
> >         for (ctrl = 0; ctrl < num_ctrls; ctrl++) {
> > -               if (pp->msi_irq[ctrl] > 0)
> > -                       irq_set_chained_handler_and_data(pp->msi_irq[ctrl],
> > -                                                   dw_chained_msi_isr, pp);
> > +               if (pp->msi_irq[ctrl] > 0) {
> > +                       ret = request_irq(pp->msi_irq[ctrl], dw_pcie_msi_isr, 0,
> > +                                         dev_name(dev), pp);
> > +                       if (ret) {
> > +                               dev_err(dev, "Failed to request irq %d: %d\n",
> > +                                       pp->msi_irq[ctrl], ret);
> > +                               __dw_pcie_free_msi(pp, ctrl);

This is where I'm using the extra parameter. If we fail to request an
interrupt, we need to free all the other interrupts that we have
requested so far, to leave everything in a clean state. But we can't
use MAX_MSI_CTRLS with __dw_pcie_free_msi() and rely on the check there
because there may be extra interrupts that we haven't requested *yet*
and we would attempt to free them.

> > +                               return ret;
> > +                       }
> > +               }
> >         }
> >  
> >         /*
> > --
> > 2.41.0
> >
>