Re: [PATCH V4 10/11] vfio/pci: Support dynamic MSI-X

From: Reinette Chatre
Date: Fri May 05 2023 - 13:21:35 EST


Hi Kevin,

On 5/5/2023 1:10 AM, Tian, Kevin wrote:
>> From: Chatre, Reinette <reinette.chatre@xxxxxxxxx>
>> Sent: Saturday, April 29, 2023 2:35 AM
>> On 4/27/2023 11:50 PM, Tian, Kevin wrote:
>>>> From: Chatre, Reinette <reinette.chatre@xxxxxxxxx>
>>>> Sent: Friday, April 28, 2023 1:36 AM

...

>>>> +/*
>>>> + * Return Linux IRQ number of an MSI or MSI-X device interrupt vector.
>>>> + * If a Linux IRQ number is not available then a new interrupt will be
>>>> + * allocated if dynamic MSI-X is supported.
>>>> + */
>>>> +static int vfio_msi_alloc_irq(struct vfio_pci_core_device *vdev,
>>>> + unsigned int vector, bool msix)
>>>> +{
>>>> + struct pci_dev *pdev = vdev->pdev;
>>>> + struct msi_map map;
>>>> + int irq;
>>>> + u16 cmd;
>>>> +
>>>> + irq = pci_irq_vector(pdev, vector);
>>>> + if (irq > 0 || !msix || !vdev->has_dyn_msix)
>>>> + return irq;
>>>
>>> if (irq >= 0 || ...)
>>>
>>
>> I am not sure about this request because pci_irq_vector() cannot return 0.
>> The Linux interrupt number will be > 0 on success. 0 means "not found"
>> (see msi_get_virq()), which is translated to -EINVAL by pci_irq_vector().
>>
>
> There is a subtle difference between the description and the code of
> pci_irq_vector().
>
> /**
> * pci_irq_vector() - Get Linux IRQ number of a device interrupt vector
> * @dev: the PCI device to operate on
> * @nr: device-relative interrupt vector index (0-based); has different
> * meanings, depending on interrupt mode:
> *
> * * MSI-X the index in the MSI-X vector table
> * * MSI the index of the enabled MSI vectors
> * * INTx must be 0
> *
> * Return: the Linux IRQ number, or -EINVAL if @nr is out of range
> */
>
> From above '0' is a valid irq number.
>
> then in following code:
>
> irq = msi_get_virq(&dev->dev, nr);
> return irq ? irq : -EINVAL;
>
> '0' is obviously invalid for msi.
>
> I didn't realize the msi part when reading the patch. It left me in
> confusion that '0' is unhandled as here we only check ">0" while in
> other places "-EINVAL" is checked.
>
> Not big matter but it sounds slightly clearer to me to follow the
> description of pci_irq_vector() instead of its internal detail.

I can add an explicit check for '0'. As you confirmed, '0' is invalid
for MSI, so I think it should be treated as an error. This is perhaps
another candidate for a WARN, considering that pci_irq_vector()
returning '0' for MSI indicates a kernel problem.

I am now considering taking guidance from pci_irq_get_affinity(),
which contains:

const struct cpumask *pci_irq_get_affinity(struct pci_dev *dev, int nr)
{
	int idx, irq = pci_irq_vector(dev, nr);
	...
	if (WARN_ON_ONCE(irq <= 0))
		return NULL;
	...
}


Would you be OK with something like the change below?

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index b549f5c97cb8..a8e96254f953 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -393,6 +393,8 @@ static int vfio_msi_alloc_irq(struct vfio_pci_core_device *vdev,
 	u16 cmd;

 	irq = pci_irq_vector(pdev, vector);
+	if (WARN_ON_ONCE(irq == 0))
+		return -EINVAL;
 	if (irq > 0 || !msix || !vdev->has_dyn_msix)
 		return irq;

I would prefer that vfio_msi_alloc_irq() return only negative error codes
on failure. Callers can then simply propagate the error code (note that
dynamic allocation can return different error codes) without needing to
translate 0 into an error.
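
To illustrate the convention, here is a minimal user-space sketch, not
kernel code. Everything in it is a hypothetical stand-in: the irqs[] table
plays the role of pci_irq_vector() (positive is a Linux IRQ number, negative
is an errno, 0 is the unexpected case), mock_dyn_alloc() plays the role of
dynamic MSI-X allocation, and the fprintf() stands in for WARN_ON_ONCE().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for pci_irq_vector(): returns the table entry. */
int mock_pci_irq_vector(const int *irqs, unsigned int nvec,
			unsigned int vector)
{
	if (vector >= nvec)
		return -EINVAL;
	return irqs[vector];
}

/*
 * Hypothetical stand-in for dynamic allocation. A non-zero @fail_err
 * simulates an allocation failure with that (negative) error code.
 */
int mock_dyn_alloc(unsigned int vector, int fail_err)
{
	return fail_err ? fail_err : 100 + (int)vector;
}

/* Returns an IRQ number > 0 on success, a negative errno otherwise. */
int sketch_alloc_irq(const int *irqs, unsigned int nvec,
		     unsigned int vector, bool msix, bool has_dyn_msix,
		     int dyn_err)
{
	int irq = mock_pci_irq_vector(irqs, nvec, vector);

	/* 0 is never valid here; report it once-style and fold into an error. */
	if (irq == 0) {
		fprintf(stderr, "unexpected IRQ 0 for vector %u\n", vector);
		return -EINVAL;
	}
	if (irq > 0 || !msix || !has_dyn_msix)
		return irq;

	return mock_dyn_alloc(vector, dyn_err);
}

/* A caller only needs a single "< 0" check and can pass the code on. */
int sketch_set_vector(const int *irqs, unsigned int nvec,
		      unsigned int vector, bool msix, bool has_dyn_msix,
		      int dyn_err)
{
	int irq = sketch_alloc_irq(irqs, nvec, vector, msix, has_dyn_msix,
				   dyn_err);

	if (irq < 0)
		return irq;
	return 0;
}
```

With a table such as { 42, 0, -EINVAL }, vector 0 yields the existing IRQ,
vector 1 is reported and turned into -EINVAL, and vector 2 is either
allocated dynamically or fails with whatever error the allocation returned.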

Reinette