Re: [PATCH 3/3] vfio: Report PASID capability via VFIO_DEVICE_FEATURE ioctl

From: Alex Williamson
Date: Mon Dec 11 2023 - 13:03:56 EST


On Sun, 26 Nov 2023 22:39:09 -0800
Yi Liu <yi.l.liu@xxxxxxxxx> wrote:

> This reports the PASID capability data to userspace via VFIO_DEVICE_FEATURE,
> hence userspace could probe PASID capability by it. This is a bit different
> with other capabilities which are reported to userspace when the user reads
> the device's PCI configuration space. There are two reasons for this.
>
> - First, Qemu by default exposes all available PCI capabilities in vfio-pci
> config space to the guest as read-only, so adding PASID capability in the
> vfio-pci config space will make it exposed to the guest automatically while
> an old Qemu doesn't really support it.

Shouldn't we also be working on hiding the PASID capability in QEMU
ASAP? This feature only allows QEMU to know PASID control is actually
available, not the guest. Maybe we're hoping this is really only used
by VFs where there's no capability currently exposed to the guest?
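
As a rough illustration of what hiding could involve (a standalone
sketch, not QEMU code; the sketch_* names and the flat cfg[] buffer
standing in for the emulated config space are assumptions): walk the
extended capability chain and unlink the PASID entry (extended
capability ID 0x1b) so the guest can no longer find it.

```c
#include <stdint.h>

#define SKETCH_EXT_CAP_START    0x100   /* first extended capability */
#define SKETCH_CFG_SPACE_SIZE   0x1000  /* PCIe extended config space */
#define SKETCH_EXT_CAP_ID_PASID 0x1b    /* PASID extended capability ID */

/* Config space is little-endian; assemble bytes explicitly. */
static uint32_t sketch_cfg_read32(const uint8_t *cfg, uint16_t off)
{
    return (uint32_t)cfg[off] |
           (uint32_t)cfg[off + 1] << 8 |
           (uint32_t)cfg[off + 2] << 16 |
           (uint32_t)cfg[off + 3] << 24;
}

static void sketch_cfg_write32(uint8_t *cfg, uint16_t off, uint32_t val)
{
    cfg[off] = val & 0xff;
    cfg[off + 1] = (val >> 8) & 0xff;
    cfg[off + 2] = (val >> 16) & 0xff;
    cfg[off + 3] = (val >> 24) & 0xff;
}

/*
 * Unlink the PASID capability from the emulated chain.
 * Returns 1 if a PASID capability was found and hidden, 0 otherwise.
 * The capability body is left in place; only the chain is rewritten.
 */
static int sketch_hide_pasid(uint8_t *cfg)
{
    uint16_t prev = 0, pos = SKETCH_EXT_CAP_START;
    int ttl = (SKETCH_CFG_SPACE_SIZE - SKETCH_EXT_CAP_START) / 8;

    while (pos >= SKETCH_EXT_CAP_START &&
           pos <= SKETCH_CFG_SPACE_SIZE - 4 && ttl-- > 0) {
        uint32_t hdr = sketch_cfg_read32(cfg, pos);
        uint16_t id = hdr & 0xffff;            /* bits 15:0  */
        uint16_t next = (hdr >> 20) & 0xffc;   /* bits 31:20 */

        if (hdr == 0)
            break;                             /* empty chain */

        if (id == SKETCH_EXT_CAP_ID_PASID) {
            if (prev) {
                /* Point the previous entry past the PASID entry. */
                uint32_t phdr = sketch_cfg_read32(cfg, prev);

                phdr = (phdr & 0x000fffff) | ((uint32_t)next << 20);
                sketch_cfg_write32(cfg, prev, phdr);
            } else {
                /* Head of chain: keep the link, blank ID and version. */
                sketch_cfg_write32(cfg, pos, (uint32_t)next << 20);
            }
            return 1;
        }
        prev = pos;
        pos = next;
    }
    return 0;
}
```

The head-of-chain case cannot simply be unlinked because the chain
always starts at 0x100, so the sketch leaves a placeholder header with
a zero ID there instead.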

> - Second, PASID capability does not exit on VFs (instead shares the cap of

s/exit/exist/

> the PF). Creating a virtual PASID capability in vfio-pci config space needs
> to find a hole to place it, but doing so may require device specific
> knowledge to avoid potential conflict with device specific registers like
> hiden bits in VF config space. It's simpler by moving this burden to the
> VMM instead of maintaining a quirk system in the kernel.

This feels a bit like an incomplete solution though, and we might
already possess device specific knowledge in the form of a variant
driver. Should this feature structure include a flag + field that
could serve to generically indicate to the VMM a location for
implementing the PASID capability? The default core implementation
might fill this only for PFs, where an emulated PASID capability
can clearly overlap the physical capability. Thanks,

Alex
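
For illustration only, the flag + field idea raised above might look
roughly like the following. This is a hypothetical layout, not part of
the posted patch or of any kernel uAPI; every sketch_* / SKETCH_* name,
the flags byte, and the cap_offset field are assumptions made for the
sake of the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PASID_CAP_EXEC          (1 << 0)
#define SKETCH_PASID_CAP_PRIV          (1 << 1)
/* Hypothetical flag: cap_offset holds a usable config space offset. */
#define SKETCH_PASID_FLAG_OFFSET_VALID (1 << 0)

struct sketch_feature_pasid {
    uint16_t capabilities;
    uint8_t  width;        /* zero means no PASID support */
    uint8_t  flags;        /* SKETCH_PASID_FLAG_* */
    uint16_t cap_offset;   /* valid only if OFFSET_VALID is set */
    uint16_t reserved;
};

/* uAPI-style structures should have a fixed, padding-free size. */
_Static_assert(sizeof(struct sketch_feature_pasid) == 8,
               "unexpected padding in sketch_feature_pasid");

/*
 * Kernel-side default (sketch): only a PF advertises a location,
 * since an emulated capability can sit on top of the physical one.
 * A variant driver with device knowledge could fill it for a VF.
 */
static void sketch_fill_location(struct sketch_feature_pasid *p,
                                 bool is_virtfn, uint16_t phys_pasid_cap)
{
    if (!is_virtfn && phys_pasid_cap) {
        p->flags |= SKETCH_PASID_FLAG_OFFSET_VALID;
        p->cap_offset = phys_pasid_cap;
    }
}

/*
 * VMM-side (sketch): returns the advertised offset, or 0 when the
 * VMM has to pick a location itself or PASID is not supported.
 */
static uint16_t sketch_vmm_pasid_offset(const struct sketch_feature_pasid *p)
{
    if (!p->width)
        return 0;
    return (p->flags & SKETCH_PASID_FLAG_OFFSET_VALID) ? p->cap_offset : 0;
}
```

With a layout along these lines, the VMM would still carry the
placement burden for VFs without a variant driver, but the generic path
would no longer need per-device knowledge in userspace.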

> Suggested-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Signed-off-by: Yi Liu <yi.l.liu@xxxxxxxxx>
> ---
> drivers/vfio/pci/vfio_pci_core.c | 47 ++++++++++++++++++++++++++++++++
> include/uapi/linux/vfio.h | 13 +++++++++
> 2 files changed, 60 insertions(+)
>
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index 1929103ee59a..8038aa45500e 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1495,6 +1495,51 @@ static int vfio_pci_core_feature_token(struct vfio_device *device, u32 flags,
>  	return 0;
>  }
>
> +static int vfio_pci_core_feature_pasid(struct vfio_device *device, u32 flags,
> +				       struct vfio_device_feature_pasid __user *arg,
> +				       size_t argsz)
> +{
> +	struct vfio_pci_core_device *vdev =
> +		container_of(device, struct vfio_pci_core_device, vdev);
> +	struct vfio_device_feature_pasid pasid = { 0 };
> +	struct pci_dev *pdev = vdev->pdev;
> +	u32 capabilities = 0;
> +	int ret;
> +
> +	/* We do not support SET of the PASID capability */
> +	ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_GET,
> +				 sizeof(pasid));
> +	if (ret != 1)
> +		return ret;
> +
> +	/*
> +	 * Needs go to PF if the device is VF as VF shares its PF's
> +	 * PASID Capability.
> +	 */
> +	if (pdev->is_virtfn)
> +		pdev = pci_physfn(pdev);
> +
> +	if (!pdev->pasid_enabled)
> +		goto out;
> +
> +#ifdef CONFIG_PCI_PASID
> +	pci_read_config_dword(pdev, pdev->pasid_cap + PCI_PASID_CAP,
> +			      &capabilities);
> +#endif
> +
> +	if (capabilities & PCI_PASID_CAP_EXEC)
> +		pasid.capabilities |= VFIO_DEVICE_PASID_CAP_EXEC;
> +	if (capabilities & PCI_PASID_CAP_PRIV)
> +		pasid.capabilities |= VFIO_DEVICE_PASID_CAP_PRIV;
> +
> +	pasid.width = (capabilities >> 8) & 0x1f;
> +
> +out:
> +	if (copy_to_user(arg, &pasid, sizeof(pasid)))
> +		return -EFAULT;
> +	return 0;
> +}
> +
>  int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
>  				void __user *arg, size_t argsz)
>  {
> @@ -1508,6 +1553,8 @@ int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
>  		return vfio_pci_core_pm_exit(device, flags, arg, argsz);
>  	case VFIO_DEVICE_FEATURE_PCI_VF_TOKEN:
>  		return vfio_pci_core_feature_token(device, flags, arg, argsz);
> +	case VFIO_DEVICE_FEATURE_PASID:
> +		return vfio_pci_core_feature_pasid(device, flags, arg, argsz);
>  	default:
>  		return -ENOTTY;
>  	}
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 495193629029..8326faf8622b 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -1512,6 +1512,19 @@ struct vfio_device_feature_bus_master {
> };
> #define VFIO_DEVICE_FEATURE_BUS_MASTER 10
>
> +/**
> + * Upon VFIO_DEVICE_FEATURE_GET, return the PASID capability for the device.
> + * Zero width means no support for PASID.
> + */
> +struct vfio_device_feature_pasid {
> +	__u16 capabilities;
> +#define VFIO_DEVICE_PASID_CAP_EXEC	(1 << 0)
> +#define VFIO_DEVICE_PASID_CAP_PRIV	(1 << 1)
> +	__u8 width;
> +	__u8 __reserved;
> +};
> +#define VFIO_DEVICE_FEATURE_PASID 11
> +
> /* -------- API for Type1 VFIO IOMMU -------- */
>
> /**
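
For reference, the register-to-feature translation done by
vfio_pci_core_feature_pasid() in the quoted hunk can be mirrored in
plain C as below. This is a standalone sketch and not the patch itself:
the sketch_* / SKETCH_* names are made up here, the register bit values
follow the PCI_PASID_CAP_* definitions in the kernel's pci_regs.h, and
the feature bits follow the struct proposed in the patch.

```c
#include <stdint.h>

/* PASID Capability register bits (per pci_regs.h) */
#define SKETCH_PCI_PASID_CAP_EXEC   0x0002  /* Execute Permission Supported */
#define SKETCH_PCI_PASID_CAP_PRIV   0x0004  /* Privileged Mode Supported */
#define SKETCH_PCI_PASID_CAP_WIDTH  0x1f00  /* Max PASID Width, bits 12:8 */

/* Feature bits as proposed in the quoted patch */
#define SKETCH_VFIO_PASID_CAP_EXEC  (1 << 0)
#define SKETCH_VFIO_PASID_CAP_PRIV  (1 << 1)

/* Same layout as the proposed struct vfio_device_feature_pasid. */
struct sketch_pasid_info {
    uint16_t capabilities;
    uint8_t  width;
    uint8_t  reserved;
};

/*
 * Translate a raw read of the PASID Capability register into the
 * feature layout. The patch reads a dword, so the upper 16 bits hold
 * the PASID Control register; the masks below ignore them.
 */
static struct sketch_pasid_info sketch_decode_pasid(uint32_t reg)
{
    struct sketch_pasid_info info = { 0 };

    if (reg & SKETCH_PCI_PASID_CAP_EXEC)
        info.capabilities |= SKETCH_VFIO_PASID_CAP_EXEC;
    if (reg & SKETCH_PCI_PASID_CAP_PRIV)
        info.capabilities |= SKETCH_VFIO_PASID_CAP_PRIV;
    info.width = (reg & SKETCH_PCI_PASID_CAP_WIDTH) >> 8;

    return info;
}

/* Per the proposed uAPI comment, zero width means no PASID support. */
static int sketch_pasid_supported(const struct sketch_pasid_info *info)
{
    return info->width != 0;
}
```

A VMM consuming the feature would then only need the width check
before deciding whether to synthesize a PASID capability for the guest.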