Re: [PATCH 1/2] cxl/pci: Add generic MSI-X/MSI irq support

From: Ira Weiny
Date: Sun Oct 23 2022 - 20:10:55 EST


On Sat, Oct 22, 2022 at 03:05:45PM -0700, Dan Williams wrote:
> Davidlohr Bueso wrote:
> > Introduce a generic irq table for CXL components/features that can have
> > standard irq support - DOE requires dynamic vector sizing and is not
> > considered here. For now the table is empty.
> >
> > Create an infrastructure to query the max vectors required for the CXL
> > device. Upon successful allocation, users can plug in their respective isr
> > at any point thereafter, which is supported by a new cxlds->has_irq flag,
> > for example, if the irq setup is not done in the PCI driver, such as
> > the case of the CXL-PMU.
> >
> > Reviewed-by: Dave Jiang <dave.jiang@xxxxxxxxx>
> > Signed-off-by: Davidlohr Bueso <dave@xxxxxxxxxxxx>
> > ---
> > drivers/cxl/cxlmem.h | 3 ++
> > drivers/cxl/pci.c | 72 ++++++++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 75 insertions(+)
> >
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index 88e3a8e54b6a..72b69b003302 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -211,6 +211,7 @@ struct cxl_endpoint_dvsec_info {
> > * @info: Cached DVSEC information about the device.
> > * @serial: PCIe Device Serial Number
> > * @doe_mbs: PCI DOE mailbox array
> > + * @has_irq: PCIe MSI-X/MSI support
> > * @mbox_send: @dev specific transport for transmitting mailbox commands
> > *
> > * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> > @@ -247,6 +248,8 @@ struct cxl_dev_state {
> >
> > struct xarray doe_mbs;
> >
> > + bool has_irq;
> > +
> > int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> > };
> >
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index faeb5d9d7a7a..9c3e95ebaa26 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -428,6 +428,73 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
> > }
> > }
> >
> > +/**
> > + * struct cxl_irq_cap - CXL feature that is capable of receiving MSI-X/MSI irqs.
> > + *
> > + * @name: Name of the device/component generating this interrupt.
> > + * @get_max_msgnum: Get the feature's largest interrupt message number. If the
> > + * feature does not have the Interrupt Supported bit set, then
> > + * return -1.
> > + */
> > +struct cxl_irq_cap {
> > + const char *name;
> > + int (*get_max_msgnum)(struct cxl_dev_state *cxlds);
>
> Why is this a callback, why not just have the features populate their
> irq numbers?

I think we have decided to forgo the callback but I'm not sure what you mean by
'populate their irq numbers'?

>
> > +};
> > +
> > +static const struct cxl_irq_cap cxl_irq_cap_table[] = {
> > + NULL
> > +};
> > +
> > +static void cxl_pci_free_irq_vectors(void *data)
> > +{
> > + pci_free_irq_vectors(data);
> > +}
> > +
> > +/*
> > + * Attempt to allocate the largest amount of necessary vectors.
> > + *
> > + * Returns 0 upon a successful allocation of *all* vectors, or a
> > + * negative value otherwise.
> > + */
> > +static int cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
> > +{
> > + struct device *dev = cxlds->dev;
> > + struct pci_dev *pdev = to_pci_dev(dev);
> > + int rc, i, vectors = -1;
> > +
> > + for (i = 0; i < ARRAY_SIZE(cxl_irq_cap_table); i++) {
> > + int irq;
> > +
> > + if (!cxl_irq_cap_table[i].get_max_msgnum)
> > + continue;
> > +
> > + irq = cxl_irq_cap_table[i].get_max_msgnum(cxlds);
> > + vectors = max_t(int, irq, vectors);
> > + }
>
> Forgive me if I have missed something, I only look at interrupt enable
> code once every few years, and the APIs are always a bit different, but
> is this not too early to read the message number? The number is not
> stable until either MSI or MSI-X has been selected below at
> pci_alloc_irq_vectors() time?

Well I keep getting wrapped around the axle on this one too.

This all started back when Jonathan originally attempted to allocate the
maximum number of vectors a device _could_ allocate. But it was recommended that
we determine the max number first then allocate that number.

This seems like a chicken and egg issue. How is the number not stable before
calling pci_alloc_irq_vectors() when you need the max msg number in that call?

The Event code seems pretty simple because we use a mailbox command to
configure which logs to use irq's and the device reports the message number for
each. Thus we can determine the max message number Events need without
enabling anything. But your comment about them not being stable now has me
more worried... :-(

Confused,
Ira

> > +
> > + /*
> > + * Semantically lack of irq support is not an error, but we
> > + * still fail to allocate, so return negative.
> > + */
> > + if (vectors == -1)
> > + return -1;
> > +
> > + vectors++;
> > + rc = pci_alloc_irq_vectors(pdev, vectors, vectors,
> > + PCI_IRQ_MSIX | PCI_IRQ_MSI);
> > + if (rc < 0)
> > + return rc;
> > +
> > + if (rc != vectors) {
> > + dev_dbg(dev, "Not enough interrupts; use polling instead.\n");
> > + /* some got allocated, clean them up */
> > + cxl_pci_free_irq_vectors(pdev);
> > + return -ENOSPC;
> > + }
> > +
> > + return devm_add_action_or_reset(dev, cxl_pci_free_irq_vectors, pdev);
> > +}
> > +
> > static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > {
> > struct cxl_register_map map;
> > @@ -494,6 +561,11 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > if (rc)
> > return rc;
> >
> > + if (!cxl_pci_alloc_irq_vectors(cxlds)) {
> > + cxlds->has_irq = true;
> > + } else
> > + cxlds->has_irq = false;
> > +
> > cxlmd = devm_cxl_add_memdev(cxlds);
> > if (IS_ERR(cxlmd))
> > return PTR_ERR(cxlmd);
> > --
> > 2.38.0
> >
>
>