RE: [PATCH 3/3] vfio: Report PASID capability via VFIO_DEVICE_FEATURE ioctl

From: Tian, Kevin
Date: Tue Dec 12 2023 - 21:10:45 EST


> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Tuesday, December 12, 2023 11:35 PM
>
> On Mon, Dec 11, 2023 at 11:49:49AM -0700, Alex Williamson wrote:
> > On Mon, 11 Dec 2023 14:10:28 -0400
> > Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> >
> > > On Mon, Dec 11, 2023 at 11:03:45AM -0700, Alex Williamson wrote:
> > > > On Sun, 26 Nov 2023 22:39:09 -0800
> > > > Yi Liu <yi.l.liu@xxxxxxxxx> wrote:
> >
> > > > > the PF). Creating a virtual PASID capability in vfio-pci config space
> > > > > needs to find a hole to place it, but doing so may require device
> > > > > specific knowledge to avoid potential conflict with device specific
> > > > > registers like hidden bits in VF config space. It's simpler to move
> > > > > this burden to the VMM instead of maintaining a quirk system in the
> > > > > kernel.
> > > >
> > > > This feels a bit like an incomplete solution though and we might
> > > > already possess device specific knowledge in the form of a variant
> > > > driver. Should this feature structure include a flag + field that
> > > > could serve to generically indicate to the VMM a location for
> > > > implementing the PASID capability? The default core implementation
> > > > might fill this only for PFs where clearly an emulated PASID capability
> > > > can overlap the physical capability. Thanks,
> > >
> > > In many ways I would prefer to solve this for good by having a way to
> > > learn a range of available config space - I liked the suggestion to
> > > use a DVSEC to mark empty space.
> >
> > Yes, DVSEC is the most plausible option for the device itself to convey
> > unused config space, but that requires hardware adoption so presumably
> > we're going to need to fill the gaps with device specific code. That
> > code might live in a variant driver or in the VMM. If we have faith
> > that DVSEC is the way, it'd make sense for a variant driver to
> > implement a virtual DVSEC to work out the QEMU implementation and set
> > a precedent.
>
> How hard do you think it would be for the kernel to synthesize the
> DVSEC if the variant driver can provide a range for it?
>
> On the other hand I'm not so keen on having variant drivers that are
> only doing this just to avoid a table in qemu :\ It seems like a

Me too. If we really want something like this, I'd prefer tracking a
table of device-specific ranges rather than requiring full-fledged
variant drivers.
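
To make the table idea concrete, a rough sketch follows. It is only an
illustration under stated assumptions: the vendor/device IDs, the ranges,
and the names cfg_hole, find_cfg_hole() and pasid_cap_offset() are all
made up here and do not exist in vfio or QEMU. The only facts relied on
are that extended capabilities live in 0x100-0xfff, are DWORD aligned,
and that the PASID extended capability occupies 8 bytes.

```c
#include <stddef.h>
#include <stdint.h>

#define EXT_CFG_START  0x100u   /* first byte of extended config space */
#define EXT_CFG_END    0x1000u  /* one past the last config space byte */
#define PASID_CAP_SIZE 8u       /* ext cap header + PASID cap/ctrl regs */

/* Hypothetical table entry: a config space range known to be unused. */
struct cfg_hole {
	uint16_t vendor;
	uint16_t device;
	uint16_t start;   /* first free byte */
	uint16_t len;     /* number of free bytes */
};

/* Made-up example entries, not real devices. */
static const struct cfg_hole cfg_holes[] = {
	{ 0x1234, 0x0001, 0x200, 0x40 },
	{ 0x1234, 0x0002, 0x3f8, 0x04 },  /* too small for PASID */
};

/* Look up the free range recorded for a device, or NULL if none. */
static const struct cfg_hole *find_cfg_hole(uint16_t vendor, uint16_t device)
{
	size_t i;

	for (i = 0; i < sizeof(cfg_holes) / sizeof(cfg_holes[0]); i++) {
		if (cfg_holes[i].vendor == vendor &&
		    cfg_holes[i].device == device)
			return &cfg_holes[i];
	}
	return NULL;
}

/*
 * Return a DWORD-aligned offset inside the device's free range where an
 * emulated PASID capability fits, or 0 if the device has no entry or the
 * range is too small.
 */
static uint16_t pasid_cap_offset(uint16_t vendor, uint16_t device)
{
	const struct cfg_hole *h = find_cfg_hole(vendor, device);
	uint32_t start, end;

	if (!h)
		return 0;

	start = ((uint32_t)h->start + 3u) & ~3u;   /* round up to DWORD */
	end = (uint32_t)h->start + h->len;

	if (start < EXT_CFG_START || end > EXT_CFG_END)
		return 0;
	if (start + PASID_CAP_SIZE > end)
		return 0;

	return (uint16_t)start;
}
```

Whether such a table would sit in the kernel or in the VMM is exactly
the open question in this thread; the sketch takes no position on that.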