Re: [PATCH 3/3] vfio: Report PASID capability via VFIO_DEVICE_FEATURE ioctl

From: Jason Gunthorpe
Date: Mon Dec 11 2023 - 13:10:36 EST


On Mon, Dec 11, 2023 at 11:03:45AM -0700, Alex Williamson wrote:
> On Sun, 26 Nov 2023 22:39:09 -0800
> Yi Liu <yi.l.liu@xxxxxxxxx> wrote:
>
> > This reports the PASID capability data to userspace via VFIO_DEVICE_FEATURE,
> > hence userspace can use it to probe the PASID capability. This is a bit different
> > from other capabilities, which are reported to userspace when the user reads
> > the device's PCI configuration space. There are two reasons for this.
> >
> > - First, Qemu by default exposes all available PCI capabilities in vfio-pci
> > config space to the guest as read-only, so adding PASID capability in the
> > vfio-pci config space will make it exposed to the guest automatically while
> > an old Qemu doesn't really support it.
>
> Shouldn't we also be working on hiding the PASID capability in QEMU
> ASAP? This feature only allows QEMU to know PASID control is actually
> available, not the guest. Maybe we're hoping this is really only used
> by VFs where there's no capability currently exposed to the guest?

Makes sense, yes

> > the PF). Creating a virtual PASID capability in vfio-pci config space needs
> > to find a hole to place it, but doing so may require device specific
> > knowledge to avoid potential conflict with device specific registers like
> > hidden bits in VF config space. It's simpler to move this burden to the
> > VMM instead of maintaining a quirk system in the kernel.
>
> This feels a bit like an incomplete solution though and we might
> already possess device specific knowledge in the form of a variant
> driver. Should this feature structure include a flag + field that
> could serve to generically indicate to the VMM a location for
> implementing the PASID capability? The default core implementation
> might fill this only for PFs where clearly an emulated PASID capability
> can overlap the physical capability. Thanks,

In many ways I would prefer to solve this for good by having a way to
learn a range of available config space - I liked the suggestion to
use a DVSEC to mark empty space.
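
For illustration only, here is a minimal sketch (not taken from this
patch series) of how a VMM could walk the PCIe extended capability list
in a 4K config space snapshot to check whether a PASID capability or any
DVSEC is present. The helper names and the use of a plain byte buffer
are assumptions; the capability IDs and the header layout follow the
PCIe spec. It deliberately does not decode a DVSEC payload, since no
layout for an "empty space" DVSEC has been defined in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CFG_SPACE_SIZE   4096u   /* PCIe extended config space size */
#define EXT_CAP_START    0x100u  /* first extended capability header */
#define EXT_CAP_ID_PASID 0x001Bu /* PASID extended capability ID */
#define EXT_CAP_ID_DVSEC 0x0023u /* Designated Vendor-Specific ext. cap ID */

/* Read a little-endian 32-bit value from the config space snapshot. */
static uint32_t read_le32(const uint8_t *cfg, size_t off)
{
	return (uint32_t)cfg[off] |
	       ((uint32_t)cfg[off + 1] << 8) |
	       ((uint32_t)cfg[off + 2] << 16) |
	       ((uint32_t)cfg[off + 3] << 24);
}

/*
 * Hypothetical helper: return the offset of the first extended
 * capability with the given ID, or 0 if it is not present.
 * Header layout: bits 15:0 cap ID, 19:16 version, 31:20 next offset.
 */
static uint16_t find_ext_cap(const uint8_t *cfg, size_t len, uint16_t id)
{
	size_t pos = EXT_CAP_START;
	/* Bound the walk so a looping list cannot hang us. */
	int ttl = (CFG_SPACE_SIZE - EXT_CAP_START) / 8;

	assert(cfg != NULL);
	if (len < CFG_SPACE_SIZE)
		return 0;

	while (ttl-- > 0 && pos >= EXT_CAP_START &&
	       pos + 4 <= CFG_SPACE_SIZE) {
		uint32_t hdr = read_le32(cfg, pos);

		/* All zeros or all ones means no (further) capability. */
		if (hdr == 0 || hdr == 0xFFFFFFFFu)
			break;
		if ((hdr & 0xFFFFu) == id)
			return (uint16_t)pos;
		/* Next pointer is dword aligned; 0 terminates the list. */
		pos = (hdr >> 20) & 0xFFCu;
	}
	return 0;
}
```

A caller would pass the 4K buffer it read from the device's config
region and test the result against 0, e.g.
find_ext_cap(cfg, 4096, EXT_CAP_ID_DVSEC).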

Jason