Re: [RFC] /dev/ioasid uAPI proposal

From: Jason Gunthorpe
Date: Thu Jun 03 2021 - 09:05:25 EST


On Thu, Jun 03, 2021 at 06:39:30AM +0000, Tian, Kevin wrote:
> > > Two helper functions are provided to support VFIO_ATTACH_IOASID:
> > >
> > > struct attach_info {
> > > u32 ioasid;
> > > // If valid, the PASID to be used physically
> > > u32 pasid;
> > > };
> > > int ioasid_device_attach(struct ioasid_dev *dev,
> > > struct attach_info info);
> > > int ioasid_device_detach(struct ioasid_dev *dev, u32 ioasid);
> >
> > Honestly, I still prefer this to be highly explicit as this is where
> > all device driver authors get involved:
> >
> > ioasid_pci_device_attach(struct pci_device *pdev, struct ioasid_dev *dev,
> > u32 ioasid);
> > ioasid_pci_device_pasid_attach(struct pci_device *pdev, u32 *physical_pasid,
> > struct ioasid_dev *dev, u32 ioasid);
>
> Then better naming it as pci_device_attach_ioasid since the 1st parameter
> is struct pci_device?

No, the leading tag indicates the API's primary subsystem, in this case
iommu (and, if you prefer, list the iommu-related arguments first).

> By keeping physical_pasid as a pointer, you want to remove the last helper
> function (ioasid_get_global_pasid) so the global pasid is returned along
> with the attach function?

It is just a thought. It allows the caller to both specify a fixed
PASID and request an allocation.
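
To make the in/out pointer idea concrete, here is a toy userspace
sketch. It is not the proposed kernel API: PASID_ALLOC, PASID_MAX,
toy_pasid_attach() and the bitmap-style allocator are all hypothetical
names invented for illustration, and the ioasid/device arguments are
left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel: the caller asks for a PASID to be allocated. */
#define PASID_ALLOC UINT32_MAX
/* Toy PASID space, for illustration only. */
#define PASID_MAX   16u

static uint8_t pasid_used[PASID_MAX];

/*
 * Toy model of the in/out pointer. On entry, *pasid == PASID_ALLOC
 * requests an allocation; any other value requests that fixed PASID.
 * On success the PASID actually in use is left in *pasid.
 * Returns 0 on success, -1 if the PASID is out of range, already
 * taken, or the toy space is exhausted.
 */
int toy_pasid_attach(uint32_t *pasid)
{
	uint32_t i;

	assert(pasid != NULL);

	if (*pasid != PASID_ALLOC) {
		/* Fixed PASID requested by the caller. */
		if (*pasid >= PASID_MAX || pasid_used[*pasid])
			return -1;
		pasid_used[*pasid] = 1;
		return 0;
	}

	/* Allocation requested: hand back the first free PASID. */
	for (i = 0; i < PASID_MAX; i++) {
		if (!pasid_used[i]) {
			pasid_used[i] = 1;
			*pasid = i;
			return 0;
		}
	}
	return -1;
}
```

With something of this shape a separate "get the global pasid" helper
would not be needed, since the caller reads the result back from the
same variable it passed in.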

I still don't have a clear idea how all this PASID complexity should
work, sorry.

> > > The actual policy depends on pdev vs. mdev, and whether ENQCMD is
> > > supported. There are three possible scenarios:
> > >
> > > (Note: /dev/ioasid uAPI is not affected by underlying PASID virtualization
> > > policies.)
> >
> > This has become unclear. I think this should start by identifying the
> > 6 main types of devices and how they can use pPASID/vPASID:
> >
> > 0) Device is a RID and cannot issue PASID
> > 1) Device is a mdev and cannot issue PASID
> > 2) Device is a mdev and programs a single fixed PASID during bind,
> > does not accept PASID from the guest
>
> There is no vPASID per se in the above 3 types, so this section only
> focuses on the latter 3 types. But I can include them in the next version
> if it sets the tone more clearly.

I think it helps

> >
> > 3) Device accepts any PASIDs from the guest. No
> > vPASID/pPASID translation is possible. (classic vfio_pci)
> > 4) Device accepts any PASID from the guest and has an
> > internal vPASID/pPASID translation (enhanced vfio_pci)
>
> what is enhanced vfio_pci? In my writing this is for mdev
> which doesn't support ENQCMD

This is a vfio_pci that mediates some element of the device interface
to communicate the vPASID/pPASID table to the device, using Max's
series for vfio_pci drivers to inject itself into VFIO.

For instance a device might send a message through the PF that the VF
has a certain vPASID/pPASID translation table. This would be useful
for devices that cannot use ENQCMD but still want to support migration
and thus need vPASID.
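
Very roughly, the table such a device (or its vfio_pci driver) would
hold could look like the userspace sketch below. Nothing here comes
from an existing driver or from Max's series: the struct layout, the
fixed XLATE_SLOTS size and the pasid_xlate_*() names are assumptions
made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy table size, illustration only. */
#define XLATE_SLOTS 8u

/* One guest-visible PASID and the physical PASID backing it. */
struct pasid_xlate_entry {
	uint32_t vpasid;
	uint32_t ppasid;
	int valid;
};

struct pasid_xlate_table {
	struct pasid_xlate_entry slot[XLATE_SLOTS];
};

/*
 * Install or update the mapping vpasid -> ppasid.
 * Returns 0 on success, -1 if the table is full.
 */
int pasid_xlate_set(struct pasid_xlate_table *t, uint32_t vpasid,
		    uint32_t ppasid)
{
	size_t i, free_slot = XLATE_SLOTS;

	assert(t != NULL);

	for (i = 0; i < XLATE_SLOTS; i++) {
		if (t->slot[i].valid && t->slot[i].vpasid == vpasid) {
			/* Existing vPASID: only the backing pPASID changes. */
			t->slot[i].ppasid = ppasid;
			return 0;
		}
		if (!t->slot[i].valid && free_slot == XLATE_SLOTS)
			free_slot = i;
	}
	if (free_slot == XLATE_SLOTS)
		return -1;

	t->slot[free_slot].vpasid = vpasid;
	t->slot[free_slot].ppasid = ppasid;
	t->slot[free_slot].valid = 1;
	return 0;
}

/*
 * Translate a guest vPASID to the physical PASID.
 * Returns 0 and writes *ppasid on a hit, -1 if there is no mapping.
 */
int pasid_xlate_lookup(const struct pasid_xlate_table *t, uint32_t vpasid,
		       uint32_t *ppasid)
{
	size_t i;

	assert(t != NULL && ppasid != NULL);

	for (i = 0; i < XLATE_SLOTS; i++) {
		if (t->slot[i].valid && t->slot[i].vpasid == vpasid) {
			*ppasid = t->slot[i].ppasid;
			return 0;
		}
	}
	return -1;
}
```

The migration angle is that the guest keeps using the same vPASID
while the destination simply reinstalls the entry with whatever pPASID
it got there.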

> for 0-2 the device will report no PASID support. Although this may duplicate
> with other information (e.g. PCI PASID cap), this provides a vendor-agnostic
> way for reporting details around IOASID.

We have to consider mdevs too here, so PCI caps are not general enough

> for 3-5 the device will report PASID support. In these cases the user is
> expected to always provide a vPASID.
>
> for 5 in addition the device will report a requirement on CPU PASID
> translation. For such device the user should talk to KVM to setup the PASID
> mapping. This way the user doesn't need to know whether a device is
> pdev or mdev. Just follows what device capability reports.

Something like that. Needs careful documentation
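
As a starting point for that documentation, the reporting you describe
could be modelled like the userspace sketch below. The flag names
(TOY_CAP_PASID, TOY_CAP_CPU_PASID_XLATE) and the helper functions are
hypothetical, not part of any proposed uAPI; the mapping from device
type to flags only restates the 0-2 / 3-5 / 5 split from your text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical capability bits a device would report. */
#define TOY_CAP_PASID            (1u << 0) /* device can issue PASIDs */
#define TOY_CAP_CPU_PASID_XLATE  (1u << 1) /* needs CPU PASID translation */

/*
 * Capabilities reported per device type, using the type numbers
 * 0..5 from this thread:
 *   0-2: no PASID support
 *   3-5: PASID support
 *   5:   additionally requires CPU PASID translation
 */
uint32_t toy_device_caps(unsigned int type)
{
	assert(type <= 5);

	if (type <= 2)
		return 0;
	if (type <= 4)
		return TOY_CAP_PASID;
	return TOY_CAP_PASID | TOY_CAP_CPU_PASID_XLATE;
}

/* The user is expected to provide a vPASID whenever PASID is reported. */
int toy_needs_vpasid(uint32_t caps)
{
	return (caps & TOY_CAP_PASID) != 0;
}

/* The user has to talk to KVM to set up the PASID mapping. */
int toy_needs_kvm_setup(uint32_t caps)
{
	return (caps & TOY_CAP_CPU_PASID_XLATE) != 0;
}
```

The user only ever looks at the reported bits, never at whether the
device is a pdev or an mdev.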

Jason