Re: VFIO (PCI) and write combine mapping of BARs

From: Benjamin Herrenschmidt
Date: Tue Jul 25 2023 - 21:20:41 EST


On Tue, 2023-07-25 at 08:39 -0300, Jason Gunthorpe wrote:
> On Tue, Jul 25, 2023 at 04:15:39PM +1000, Benjamin Herrenschmidt wrote:
> > > Assuming this is for #2, I think VFIO has fallen into a bit of a trap
> > > by allowing userspace to form the mmap offset. I've seen this happen
> > > in other subsystems too. It seems like a good idea then you realize
> > > you need more stuff in the mmap space and become sad.
> > >
> > > Typically the way out is to convert the mmap offset into a cookie where
> > > userspace issues some ioctl and then the ioctl returns an opaque mmap
> > > offset to use.
> > >
> > > eg in the vfio context you'd do some 'prepare region for mmap' ioctl
> > > where you could specify flags. The kernel would encode the flags in
> > > the cookie and then mmap would do the right thing. Adding more stuff
> > > is done by enhancing the prepare ioctl.
> > >
> > > Legacy mmap offsets are kept working.
> >
> > This is indeed what I have in mind, i.e. VFIO has legacy regions and
> > add-on regions, though the latter are currently only exploited by some
> > drivers that create their own add-on regions. My proposal is to add an
> > ioctl to create them from userspace as "children" of an existing
> > driver-provided region, allowing different attributes to be set for mmap.
>
> I wouldn't call it children, you are just getting a different mmap
> cookie for the same region object.

I thought they could be subsets, but that might be overkill.

> > In the current VFIO the implementation is *entirely* in vfio_pci_core
> > for PCI and entirely in vfio_platform_common.c for platform, so while
> > the same ioctls could be imagined to create sub-regions, it would have
> > to be completely implemented twice unless we do a lot of heavy lifting
> > to move some of that region stuff into common code.
>
> The machinery for managing the mmap cookies should be in common code

Ok. I'll whip up a POC within vfio_pci only initially to test the
concept and to agree on the API, then look at how we can clean all that
up.
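
To make the cookie idea concrete for discussion, here is a minimal
userspace model of how flags could be folded into an opaque mmap offset
while legacy offsets keep decoding as before. Everything in it is an
assumption for illustration: the names (prepare_mmap_cookie,
MMAP_FLAG_WC, struct mmap_target), the bit layout, and the 40-bit
region shift are hypothetical and are not an existing or agreed VFIO
API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define OFFSET_SHIFT  40            /* region index lives above this bit */
#define REGION_MASK   0xffULL       /* 8 bits of region index */
#define FLAGS_SHIFT   48            /* cookie flags sit above the region */
#define FLAGS_MASK    0xffULL       /* 8 bits of flags */
#define COOKIE_MARK   (1ULL << 62)  /* set only on offsets from the ioctl */
#define MMAP_FLAG_WC  (1u << 0)     /* hypothetical: ask for write-combine */

/* What the mmap handler would recover from an offset. */
struct mmap_target {
    unsigned int region;        /* which region object to map */
    unsigned int flags;         /* mapping attributes, 0 for legacy */
    uint64_t offset_in_region;  /* byte offset inside that region */
};

/* Models a 'prepare region for mmap' ioctl: returns an opaque offset. */
static uint64_t prepare_mmap_cookie(unsigned int region, unsigned int flags)
{
    assert(region <= REGION_MASK && flags <= FLAGS_MASK);
    return COOKIE_MARK |
           ((uint64_t)flags << FLAGS_SHIFT) |
           ((uint64_t)region << OFFSET_SHIFT);
}

/* Models the legacy scheme where userspace forms the offset itself. */
static uint64_t legacy_mmap_offset(unsigned int region)
{
    assert(region <= REGION_MASK);
    return (uint64_t)region << OFFSET_SHIFT;
}

/* Models the mmap path: one decoder handles both kinds of offset. */
static struct mmap_target decode_mmap_offset(uint64_t off)
{
    struct mmap_target t;

    t.region = (unsigned int)((off >> OFFSET_SHIFT) & REGION_MASK);
    /* Legacy offsets carry no flags, so they get default attributes. */
    t.flags = (off & COOKIE_MARK)
            ? (unsigned int)((off >> FLAGS_SHIFT) & FLAGS_MASK)
            : 0;
    t.offset_in_region = off & ((1ULL << OFFSET_SHIFT) - 1);
    return t;
}
```

In this model a cookie for region 2 with MMAP_FLAG_WC and a legacy
offset for region 2 both decode to the same region object; only the
flags differ, which matches the "different mmap cookie for the same
region object" view. Adding more attributes later would only mean
defining more flag bits in the prepare step.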

Cheers,
Ben.