Re: [RFC] /dev/ioasid uAPI proposal

From: David Gibson
Date: Thu Jun 03 2021 - 02:28:42 EST


On Wed, Jun 02, 2021 at 01:37:53PM -0300, Jason Gunthorpe wrote:
> On Wed, Jun 02, 2021 at 04:57:52PM +1000, David Gibson wrote:
>
> > I don't think presence or absence of a group fd makes a lot of
> > difference to this design. Having a group fd just means we attach
> > groups to the ioasid instead of individual devices, and we no longer
> > need the bookkeeping of "partial" devices.
>
> Oh, I think we really don't want to attach the group to an ioasid, or
> at least not as a first-class idea.
>
> The fundamental problem that got us here is we now live in a world
> where there are many ways to attach a device to an IOASID:

I'm not seeing that that's necessarily a problem.

> - A RID binding
> - A RID,PASID binding
> - A RID,PASID binding for ENQCMD

I have to admit I haven't fully grasped the differences between these
modes. I'm hoping we can consolidate at least some of them into the
same sort of binding onto different IOASIDs (which may be linked in
parent/child relationships).

> - A SW TABLE binding
> - etc
>
> The selection of which mode to use is based on the specific
> driver/device operation. Ie the thing that implements the 'struct
> vfio_device' is the thing that has to select the binding mode.

I thought userspace selected the binding mode - although not all modes
will be possible for all devices.

> group attachment was fine when there was only one mode. As you say it
> is fine to just attach every group member with RID binding if RID
> binding is the only option.
>
> When SW TABLE binding was added the group code was hacked up - now the
> group logic is choosing between RID/SW TABLE in a very hacky and mdev
> specific way, and this is just a mess.

Sounds like it. What do you propose instead to handle backwards
compatibility for group-based VFIO code?

> The flow must carry the IOASID from the /dev/iommu to the vfio_device
> driver and the vfio_device implementation must choose which binding
> mode and parameters it wants based on driver and HW configuration.
>
> eg if two PCI devices are in a group then it is perfectly fine that
> one device uses RID binding and the other device uses RID,PASID
> binding.

Uhhhh... I don't see how that can be. They could well be in the same
group because their RIDs cannot be distinguished from each other.

> The only place I see for a "group bind" in the uAPI is some compat
> layer for the vfio container, and the implementation would be quite
> different, we'd have to call each vfio_device driver in the group and
> execute the IOASID attach IOCTL.
>
> > > I would say no on the container. /dev/ioasid == the container, having
> > > two competing objects at once in a single process is just a mess.
> >
> > Right. I'd assume that for compatibility, creating a container would
> > create a single IOASID under the hood with a compatiblity layer
> > translating the container operations to iosaid operations.
>
> It is a nice dream for sure
>
> /dev/vfio could be a special case of /dev/ioasid just with a different
> uapi and ending up with only one IOASID. They could be interchangable
> from then on, which would simplify the internals of VFIO if it
> consistently delt with these new ioasid objects everywhere. But last I
> looked it was complicated enough to best be done later on
>
> Jason
>

--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature