Re: VFIO (PCI) and write combine mapping of BARs

From: Jason Gunthorpe
Date: Fri Jul 14 2023 - 08:38:03 EST


On Fri, Jul 14, 2023 at 09:13:27AM +0200, Lorenzo Pieralisi wrote:
> [+Catalin, Marc, Jason]
>
> On Fri, Jul 14, 2023 at 12:32:49PM +1000, Benjamin Herrenschmidt wrote:
> > Hi Folks !
> >
> > I'd like to revive an old discussion as we (Amazon Linux) have been
> > getting asks for it.
> >
> > What's the best interface to provide the option of write combine mmap's
> > of BARs via VFIO ?
>
> There is an ongoing thread on this topic that we should use to
> bring this discussion to completion:
>
> https://lore.kernel.org/linux-arm-kernel/ZHcxHbCb439I1Uk2@xxxxxxx

There are two topics here

1) Make ARM KVM allow the VM to select WC for its MMIO. This has
evolved in a way that is not related to VFIO

2) Allow VFIO to create mmaps with WC for non-VM use cases like DPDK.

We have a draft patch for #1, and I think a general understanding with
ARM folks that this is the right direction.

#2 is closer to what this email talks about - providing mmaps with
specific flags.

Benjamin, which are you interested in?

> > The problem isn't so much the low level implementation, we just have to
> > play with the pgprot, the question is more around what API to present
> > to control this.

Assuming this is for #2, I think VFIO has fallen into a bit of a trap
by allowing userspace to form the mmap offset. I've seen this happen
in other subsystems too. It seems like a good idea, then you realize
you need more stuff in the mmap space and become sad.

Typically the way out is to convert the mmap offset into a cookie where
userspace issues some ioctl and then the ioctl returns an opaque mmap
offset to use.

E.g. in the vfio context you'd do some 'prepare region for mmap' ioctl
where you could specify flags. The kernel would encode the flags in
the cookie and then mmap would do the right thing. Adding more stuff
is done by enhancing the prepare ioctl.

Legacy mmap offsets are kept working.
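
To make the cookie idea concrete, here is a rough userspace model, not
kernel code and not a proposed uAPI. Every name in it (the ex_ prefix,
EX_MMAP_FLAG_WC, the bit layout, the marker bit) is hypothetical, and
the 40-bit shift is only an assumption picked to resemble an
index-in-the-high-bits legacy encoding. It shows a 'prepare' step
folding flags into an opaque offset, and a decode step that treats
offsets without the marker bit as legacy offsets with no flags:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; none of this is existing VFIO uAPI. */
#define EX_OFFSET_SHIFT  40                                  /* assumed legacy index shift */
#define EX_OFFSET_MASK   ((UINT64_C(1) << EX_OFFSET_SHIFT) - 1)
#define EX_INDEX_BITS    8
#define EX_INDEX_MASK    ((UINT64_C(1) << EX_INDEX_BITS) - 1)
#define EX_FLAGS_SHIFT   (EX_OFFSET_SHIFT + EX_INDEX_BITS)   /* flags sit above the index */
#define EX_FLAGS_MASK    UINT64_C(0xff)
#define EX_COOKIE_BIT    (UINT64_C(1) << 62)                 /* marks a cookie-style offset */

#define EX_MMAP_FLAG_WC  (1u << 0)                           /* ask for a write-combine mapping */
#define EX_MMAP_FLAGS_ALL EX_MMAP_FLAG_WC

/* What the mmap handler would recover from the offset it is given. */
struct ex_mmap_req {
	uint64_t index;       /* region index */
	uint32_t flags;       /* EX_MMAP_FLAG_* requested at prepare time */
	uint64_t region_off;  /* byte offset inside the region */
};

/*
 * Model of the 'prepare region for mmap' step: validate the request and
 * hand back an opaque offset. Returns 0 on success, -1 on a bad argument.
 */
int ex_prepare_region_mmap(uint32_t index, uint32_t flags, uint64_t *cookie)
{
	assert(cookie != NULL);

	if (index > EX_INDEX_MASK || (flags & ~EX_MMAP_FLAGS_ALL))
		return -1;

	*cookie = EX_COOKIE_BIT |
		  ((uint64_t)flags << EX_FLAGS_SHIFT) |
		  ((uint64_t)index << EX_OFFSET_SHIFT);
	return 0;
}

/*
 * Model of the mmap side: cookie offsets carry flags, anything else is
 * decoded as a legacy offset (index in the high bits, no flags).
 */
struct ex_mmap_req ex_decode_offset(uint64_t off)
{
	struct ex_mmap_req req;

	req.region_off = off & EX_OFFSET_MASK;
	if (off & EX_COOKIE_BIT) {
		req.index = (off >> EX_OFFSET_SHIFT) & EX_INDEX_MASK;
		req.flags = (uint32_t)((off >> EX_FLAGS_SHIFT) & EX_FLAGS_MASK);
	} else {
		req.index = off >> EX_OFFSET_SHIFT;
		req.flags = 0;
	}
	return req;
}
```

Userspace would pass the returned value (plus any offset within the
region) as the mmap offset. Adding another option later only means
defining another flag in the prepare step; the legacy branch is
untouched.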

> > This is still quite specific to PCI, but so is the entire regions
> > mechanism, so I don't see an easy path to something more generic at
> > this stage.

Regions are general, but the encoding of the mmap cookie has various
PCI semantics when used with the PCI interface.

We'd want the same ability with platform devices too, for instance.

Jason