Re: [PATCH 3/3] vfio: Report PASID capability via VFIO_DEVICE_FEATURE ioctl

From: Yi Liu
Date: Mon Dec 11 2023 - 22:51:32 EST


On 2023/12/12 11:39, Alex Williamson wrote:
On Tue, 12 Dec 2023 02:43:20 +0000
"Duan, Zhenzhong" <zhenzhong.duan@xxxxxxxxx> wrote:

-----Original Message-----
From: Alex Williamson <alex.williamson@xxxxxxxxxx>
Sent: Tuesday, December 12, 2023 2:04 AM
Subject: Re: [PATCH 3/3] vfio: Report PASID capability via
VFIO_DEVICE_FEATURE ioctl

On Sun, 26 Nov 2023 22:39:09 -0800
Yi Liu <yi.l.liu@xxxxxxxxx> wrote:
This reports the PASID capability data to userspace via VFIO_DEVICE_FEATURE,
hence userspace could probe PASID capability by it. This is a bit different
with other capabilities which are reported to userspace when the user reads
the device's PCI configuration space. There are two reasons for this.

- First, Qemu by default exposes all available PCI capabilities in vfio-pci
config space to the guest as read-only, so adding PASID capability in the
vfio-pci config space will make it exposed to the guest automatically while
an old Qemu doesn't really support it.

Shouldn't we also be working on hiding the PASID capability in QEMU
ASAP? This feature only allows QEMU to know PASID control is actually
available, not the guest. Maybe we're hoping this is really only used
by VFs where there's no capability currently exposed to the guest?

PASID capability is not exposed to QEMU through config space,
VFIO_DEVICE_FEATURE ioctl is the only interface to expose PASID
cap to QEMU for both PF and VF.

/*
 * Lengths of PCIe/PCI-X Extended Config Capabilities
 *   0: Removed or masked from the user visible capability list
 *   FF: Variable length
 */
static const u16 pci_ext_cap_length[PCI_EXT_CAP_ID_MAX + 1] = {
	...
	[PCI_EXT_CAP_ID_PASID] = 0, /* not yet */
};
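
To illustrate the masking described in that comment, here is a minimal
standalone sketch, not the actual vfio-pci code. The names ext_cap_length
and ext_cap_visible are hypothetical, the AER length is a placeholder
value, and EXT_CAP_ID_MAX is set to the PASID ID only to keep the table
small. The one assumption carried over from the snippet above is that a
zero length means the capability is dropped from the user-visible list.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extended capability IDs as defined by the PCIe specification. */
#define EXT_CAP_ID_ERR    0x01  /* Advanced Error Reporting */
#define EXT_CAP_ID_PASID  0x1b  /* Process Address Space ID */
#define EXT_CAP_ID_MAX    EXT_CAP_ID_PASID  /* illustrative upper bound */

/*
 * Hypothetical stand-in for a per-capability length table.
 * 0 means "hidden from the user"; unlisted entries default to 0.
 */
static const uint16_t ext_cap_length[EXT_CAP_ID_MAX + 1] = {
    [EXT_CAP_ID_ERR]   = 0x48,  /* placeholder, not the real table entry */
    [EXT_CAP_ID_PASID] = 0,     /* masked: never shown in config space */
};

/* A capability is visible only if it is in range and has a nonzero length. */
static inline bool ext_cap_visible(unsigned int id)
{
    return id <= EXT_CAP_ID_MAX && ext_cap_length[id] != 0;
}
```

With this table, ext_cap_visible(EXT_CAP_ID_PASID) is false, which is why
the information has to reach userspace some other way.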

Ah, thanks. The comment made me think it was already exposed and I
didn't double check. So we really just want to convey the information
of the PASID capability outside of config space because if we pass the
capability itself existing userspace will blindly expose a read-only
version to the guest. That could be better explained in the commit log
and comments.

Aha, yes. It was mentioned in the commit log (quoted below), but evidently
not clearly enough. Will refine. :)

- First, Qemu by default exposes all available PCI capabilities in vfio-pci
config space to the guest as read-only, so adding PASID capability in the
vfio-pci config space will make it exposed to the guest automatically while
an old Qemu doesn't really support it.


So how do we keep up with PCIe spec updates relative to the PASID
capability with this proposal? Would it make more sense to report the
raw capability register and capability version rather than a translated
copy thereof? Perhaps just masking the fields we're currently prepared
to expose. Thanks,

I have a minor concern about reporting the raw capability register and
capability version. With that approach, if an old host kernel (one that only
supports version 1 of the PASID capability) runs on new hardware that
supports, say, version 2, the VM would see new capabilities that the host
kernel does not know about. Is that acceptable?
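
For reference, one possible reading of "masking the fields we're currently
prepared to expose" could look like the sketch below. It is an assumption
about the shape of such a filter, not code from this series. The field
masks mirror the PCI_PASID_CAP_* values in include/uapi/linux/pci_regs.h;
PASID_CAP_KNOWN, pasid_cap_filter() and pasid_cap_width() are hypothetical
names. The capability version would presumably need similar capping, which
this sketch leaves out.

```c
#include <stdint.h>

/* Fields of the 16-bit PASID Capability register (offset 0x04). */
#define PASID_CAP_EXEC   0x0002u  /* Execute Permission Supported */
#define PASID_CAP_PRIV   0x0004u  /* Privileged Mode Supported */
#define PASID_CAP_WIDTH  0x1f00u  /* Max PASID Width, bits 12:8 */

/* Hypothetical: the set of fields the host kernel understands today. */
#define PASID_CAP_KNOWN  (PASID_CAP_EXEC | PASID_CAP_PRIV | PASID_CAP_WIDTH)

/* Return the raw register value with every unknown bit cleared. */
static inline uint16_t pasid_cap_filter(uint16_t raw)
{
    return (uint16_t)(raw & PASID_CAP_KNOWN);
}

/* Extract the Max PASID Width field from a (filtered) register value. */
static inline unsigned int pasid_cap_width(uint16_t cap)
{
    return (cap & PASID_CAP_WIDTH) >> 8;
}
```

With such a filter, bits defined by a later spec revision would read as
zero in whatever is reported, so userspace would only see fields the host
kernel already knows how to handle.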

--
Regards,
Yi Liu