Re: [PATCH RFC] cxl/pci: Skip irq features if irq's are not supported

From: Alison Schofield
Date: Tue Jan 09 2024 - 18:29:55 EST


On Mon, Jan 08, 2024 at 11:51:13PM -0800, Ira Weiny wrote:
> CXL 3.1 Section 3.1.1 states:
>
> "A Function on a CXL device must not generate INTx messages if
> that Function participates in CXL.cache protocol or CXL.mem
> protocols."
>
> The generic CXL memory driver only supports devices which use the
> CXL.mem protocol. The current driver attempts to allocate MSI/MSI-X
> vectors in anticipation of their need for mailbox interrupts or event
> processing. However, the above requirement does not require a device to
> support interrupts at all. A device may not use mailbox interrupts and
> may be configured for firmware first event processing.
>
> Rather than fail device probe if interrupts are not supported; flag such
> that irqs are not supported and do not enable features which require
> interrupts. dev_warn() in those cases which require interrupts but they
> were not supported.
>
> It is possible for a device to have host based event processing through
> polling but this patch does not support the addition of such polling.
> Leave that to the future if such a device comes along.
>
> Signed-off-by: Ira Weiny <ira.weiny@xxxxxxxxx>
> ---
> Compile tested only.
>
> This is an RFC based on errors seen by Dave Larson and reported on
> discord. Dan requested that the driver not fail if irqs are not
> required.
> ---
> drivers/cxl/cxlmem.h | 2 ++
> drivers/cxl/pci.c | 25 +++++++++++++++++++------
> 2 files changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index a2fcbca253f3..422bc9657e5c 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -410,6 +410,7 @@ enum cxl_devtype {
> * @ram_res: Active Volatile memory capacity configuration
> * @serial: PCIe Device Serial Number
> * @type: Generic Memory Class device or Vendor Specific Memory device
> + * @irq_supported: Flag if irqs are supported by the device
> */
> struct cxl_dev_state {
> struct device *dev;
> @@ -424,6 +425,7 @@ struct cxl_dev_state {
> struct resource ram_res;
> u64 serial;
> enum cxl_devtype type;
> + bool irq_supported;
> };
>
> /**
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 0155fb66b580..bb90ac011290 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -443,6 +443,12 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
> if (!(cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ))
> return 0;
>
> + if (!cxlds->irq_supported) {
> + dev_err(cxlds->dev, "Mailbox interrupts enabled but device indicates no interrupt vectors supported.\n");
> + dev_err(cxlds->dev, "Skip mailbox iterrupt configuration.\n");
> + return 0;
> + }
> +

Commit msg says dev_warn() yet here it is dev_err()

Can you fit in one msg, something like:
"Device does not support mailbox interrupts\n"

Perhaps skip the full stops at the end of the messages. No other dev_*()
in this file adds them. See Documentation/process/coding-style.rst

Spellcheck: s/iterrupt/interrupt/
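
To make the single-message suggestion concrete, here is a rough userspace
sketch. It is not driver code: the stub struct definitions, the dev_warn()
stand-in, the warn_count counter, and the helper name mbox_irq_usable() are
all made up so the snippet builds outside the kernel. Only the message text
and the irq_supported flag come from this thread.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-ins for the kernel types (assumption, not the real ones). */
struct device { const char *name; };
struct cxl_dev_state { struct device *dev; bool irq_supported; };

static int warn_count; /* counts emitted warnings, for illustration only */

/* Stand-in for the kernel's dev_warn(); prints to stderr. */
static void dev_warn(const struct device *dev, const char *fmt, ...)
{
	va_list ap;

	warn_count++;
	fprintf(stderr, "%s: ", dev->name);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
}

/* Hypothetical helper: true when mailbox irq setup should proceed. */
static bool mbox_irq_usable(struct cxl_dev_state *cxlds)
{
	assert(cxlds && cxlds->dev);

	if (!cxlds->irq_supported) {
		/* One message, dev_warn() as in the commit log, no full stop. */
		dev_warn(cxlds->dev, "Device does not support mailbox interrupts\n");
		return false;
	}
	return true;
}
```

In the patch itself this would just be the one dev_warn() call followed by
return 0 inside cxl_pci_setup_mailbox().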


> msgnum = FIELD_GET(CXLDEV_MBOX_CAP_IRQ_MSGNUM_MASK, cap);
> irq = pci_irq_vector(to_pci_dev(cxlds->dev), msgnum);
> if (irq < 0)
> @@ -587,7 +593,8 @@ static int cxl_mem_alloc_event_buf(struct cxl_memdev_state *mds)
> return devm_add_action_or_reset(mds->cxlds.dev, free_event_buf, buf);
> }
>
> -static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
> +static void cxl_alloc_irq_vectors(struct pci_dev *pdev,
> + struct cxl_dev_state *cxlds)
> {
> int nvecs;
>
> @@ -604,9 +611,10 @@ static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
> PCI_IRQ_MSIX | PCI_IRQ_MSI);
> if (nvecs < 1) {
> dev_dbg(&pdev->dev, "Failed to alloc irq vectors: %d\n", nvecs);
> - return -ENXIO;
> + return;
> }
> - return 0;
> +
> + cxlds->irq_supported = true;
> }
>
> static irqreturn_t cxl_event_thread(int irq, void *id)
> @@ -754,6 +762,13 @@ static int cxl_event_config(struct pci_host_bridge *host_bridge,
> if (!host_bridge->native_cxl_error)
> return 0;
>
> + /* Polling not supported */

I understand this comment while reading it in the context of this patch.
Lacking that context, maybe it deserves a bit more like you wrote in
the commit log. Be clear that it's the driver that is not supporting
polling, and that if or when the driver does add polling support, there'll
be an alternative method for processing events. IIUC ;)
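
Something along these lines is what I have in mind for the comment, shown
as a rough userspace sketch. The stub types, the dev_warn() stand-in, the
warn_count counter, the helper name cxl_event_irq_usable(), and the exact
comment and message wording are all hypothetical; only the irq_supported
flag and the "driver does not support polling" point come from the patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-ins for the kernel types (assumption, not the real ones). */
struct device { const char *name; };
struct cxl_dev_state { struct device *dev; bool irq_supported; };

static int warn_count; /* counts emitted warnings, for illustration only */

/* Stand-in for the kernel's dev_warn(); prints to stderr. */
static void dev_warn(const struct device *dev, const char *fmt, ...)
{
	va_list ap;

	warn_count++;
	fprintf(stderr, "%s: ", dev->name);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
}

/* Hypothetical helper mirroring the new check in cxl_event_config(). */
static bool cxl_event_irq_usable(struct cxl_dev_state *cxlds)
{
	assert(cxlds && cxlds->dev);

	/*
	 * The driver only processes events from interrupt context. It does
	 * not implement event polling, so a device without irq vectors
	 * cannot have its host events serviced. Skip event processing
	 * instead of failing probe. If the driver gains polling support,
	 * that would be the alternative path here.
	 */
	if (!cxlds->irq_supported) {
		dev_warn(cxlds->dev, "Event processing requires interrupts, skipping\n");
		return false;
	}
	return true;
}
```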


> + if (!mds->cxlds.irq_supported) {
> + dev_err(mds->cxlds.dev, "Host events enabled but device indicates no interrupt vectors supported.\n");
> + dev_err(mds->cxlds.dev, "Event polling is not supported, skip event processing.\n");
> + return 0;
> + }

Similar to above: dev_warn() rather than dev_err(), one message, no
trailing full stop.


> +
> rc = cxl_mem_alloc_event_buf(mds);
> if (rc)
> return rc;
> @@ -845,9 +860,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> else
> dev_warn(&pdev->dev, "Media not active (%d)\n", rc);
>
> - rc = cxl_alloc_irq_vectors(pdev);
> - if (rc)
> - return rc;
> + cxl_alloc_irq_vectors(pdev, cxlds);
>
> rc = cxl_pci_setup_mailbox(mds);
> if (rc)
>
> ---
> base-commit: 0dd3ee31125508cd67f7e7172247f05b7fd1753a
> change-id: 20240108-dont-fail-irq-a96310368f0f
>
> Best regards,
> --
> Ira Weiny <ira.weiny@xxxxxxxxx>
>