Re: [PATCH 3/3] vfio: Report PASID capability via VFIO_DEVICE_FEATURE ioctl

From: Yi Liu
Date: Mon Dec 11 2023 - 22:41:57 EST


On 2023/12/12 10:16, Tian, Kevin wrote:
From: Alex Williamson <alex.williamson@xxxxxxxxxx>
Sent: Tuesday, December 12, 2023 2:04 AM

On Sun, 26 Nov 2023 22:39:09 -0800
Yi Liu <yi.l.liu@xxxxxxxxx> wrote:

This reports the PASID capability data to userspace via
VFIO_DEVICE_FEATURE, so userspace can probe the PASID capability through it.
This is a bit different from other capabilities, which are reported to
userspace when the user reads the device's PCI configuration space. There
are two reasons for this.

- First, Qemu by default exposes all available PCI capabilities in vfio-pci
config space to the guest as read-only, so adding PASID capability in the
vfio-pci config space will make it exposed to the guest automatically while
an old Qemu doesn't really support it.

Shouldn't we also be working on hiding the PASID capability in QEMU
ASAP? This feature only allows QEMU to know PASID control is actually
available, not the guest. Maybe we're hoping this is really only used
by VFs where there's no capability currently exposed to the guest?

We expect this to be used by both PF/VF. It doesn't make sense to have
separate interfaces between them.

I'm not aware that the PASID capability has been exported today. So
yes, we should fix QEMU ASAP, and also remove the line exposing it
in vfio_pci_config.c.

The kernel side hides the PASID capability by setting its length to 0
in the array below. As a result, QEMU won't see it in the cap chain.
Do you mean we need to let QEMU always ignore it even if the kernel side
does not hide it?

static const u16 pci_ext_cap_length[PCI_EXT_CAP_ID_MAX + 1] = {
...
[PCI_EXT_CAP_ID_PASID] = 0, /* not yet */
...
};

So far, the kernel is still hiding it.
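
To illustrate the effect of the zero length entry, here is a minimal
standalone sketch. It is a toy model, not the code in vfio_pci_config.c:
the names toy_ecap, toy_ext_cap_length and toy_filter_ecaps are
hypothetical, and the non-zero lengths are placeholders. It only shows
that a cap whose table length is 0 never reaches the chain userspace sees.

```c
#include <stddef.h>
#include <stdint.h>

/* Extended capability IDs as defined by the PCIe spec. */
#define EXT_CAP_ID_AER   0x01
#define EXT_CAP_ID_ATS   0x0F
#define EXT_CAP_ID_PASID 0x1B
#define EXT_CAP_ID_MAX   EXT_CAP_ID_PASID /* toy upper bound */

/* Hypothetical record for one capability found in the physical chain. */
struct toy_ecap {
	uint16_t id;
	uint16_t offset;
};

/* Toy length table: 0 means "hide this capability from userspace". */
static const uint16_t toy_ext_cap_length[EXT_CAP_ID_MAX + 1] = {
	[EXT_CAP_ID_AER]   = 0x48, /* placeholder length */
	[EXT_CAP_ID_ATS]   = 0x08, /* placeholder length */
	[EXT_CAP_ID_PASID] = 0,    /* not yet */
};

/*
 * Copy the physical chain into 'out', dropping every capability whose
 * table length is 0 (or whose ID is unknown). Returns the number kept.
 */
static size_t toy_filter_ecaps(const struct toy_ecap *phys, size_t n,
			       struct toy_ecap *out)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		uint16_t id = phys[i].id;
		uint16_t len = id <= EXT_CAP_ID_MAX ? toy_ext_cap_length[id] : 0;

		if (len == 0)
			continue; /* hidden: never linked into the virtual chain */
		out[kept++] = phys[i];
	}
	return kept;
}
```

With a physical chain of AER, PASID and ATS, only AER and ATS would
survive the filter, which matches what QEMU observes today.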



- Second, PASID capability does not exit on VFs (instead shares the cap of

s/exit/exist/

the PF). Creating a virtual PASID capability in vfio-pci config space needs
to find a hole to place it, but doing so may require device specific
knowledge to avoid potential conflict with device specific registers like
hidden bits in VF config space. It's simpler to move this burden to the
VMM than to maintain a quirk system in the kernel.

This feels a bit like an incomplete solution though and we might
already possess device specific knowledge in the form of a variant
driver. Should this feature structure include a flag + field that
could serve to generically indicate to the VMM a location for
implementing the PASID capability? The default core implementation
might fill this only for PFs where clearly an emulated PASID capability
can overlap the physical capability. Thanks,


Makes sense.

A location alone may not be enough; we may also need to know whether there
is any successive cap, so that we can insert the capability into the cap chain.
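
To make the concern concrete, below is a rough sketch of what a VMM would
have to do when linking an emulated capability into the extended cap
chain at a given location. It is an illustration under assumptions, not
QEMU or kernel code: toy_insert_ecap and the cfg_* helpers are
hypothetical, the chain is assumed to be non-empty and sorted by offset,
and the sketch does not verify that the hole is large enough or unused.
The header layout (ID in bits 15:0, version in 19:16, next offset in
31:20) follows the PCIe extended capability header.

```c
#include <stdint.h>

#define CFG_SIZE   4096
#define ECAP_START 0x100
#define MAX_ECAPS  ((CFG_SIZE - ECAP_START) / 8)

#define ECAP_NEXT(hdr) (((hdr) >> 20) & 0xffc)
#define ECAP_HDR(id, ver, next) \
	((uint32_t)(id) | ((uint32_t)(ver) << 16) | ((uint32_t)(next) << 20))

/* Little-endian 32-bit accessors on an emulated config space buffer. */
static uint32_t cfg_read32(const uint8_t *cfg, uint16_t off)
{
	return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
	       ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

static void cfg_write32(uint8_t *cfg, uint16_t off, uint32_t val)
{
	cfg[off] = val & 0xff;
	cfg[off + 1] = (val >> 8) & 0xff;
	cfg[off + 2] = (val >> 16) & 0xff;
	cfg[off + 3] = (val >> 24) & 0xff;
}

/*
 * Link a new capability header at 'pos' between its predecessor and its
 * successor in the chain. Returns 0 on success, -1 on failure.
 */
static int toy_insert_ecap(uint8_t *cfg, uint16_t pos, uint16_t id,
			   uint8_t ver)
{
	uint16_t off = ECAP_START;

	if (pos <= ECAP_START || pos > CFG_SIZE - 4 || (pos & 3))
		return -1;
	if (cfg_read32(cfg, ECAP_START) == 0)
		return -1; /* empty chain is not handled in this sketch */

	for (int guard = 0; guard < MAX_ECAPS; guard++) {
		uint32_t hdr = cfg_read32(cfg, off);
		uint16_t next = ECAP_NEXT(hdr);

		if (next == pos)
			return -1; /* location already holds a capability */
		if (next == 0 || next > pos) {
			/* New cap inherits the successor, predecessor points at it. */
			cfg_write32(cfg, pos, ECAP_HDR(id, ver, next));
			cfg_write32(cfg, off,
				    (hdr & 0x000fffff) | ((uint32_t)pos << 20));
			return 0;
		}
		off = next;
	}
	return -1; /* malformed or looping chain */
}
```

Note that the insert has to rewrite the predecessor's next pointer and
preserve the successor, which is why knowing only the location may be
insufficient if the VMM cannot otherwise see the full chain.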

--
Regards,
Yi Liu