Re: [RFC] /dev/ioasid uAPI proposal

From: Jason Gunthorpe
Date: Thu Jun 03 2021 - 16:10:23 EST


On Thu, Jun 03, 2021 at 02:01:46PM -0600, Alex Williamson wrote:

> > > > 1) Mixing IOMMU_CAP_CACHE_COHERENCY and !IOMMU_CAP_CACHE_COHERENCY
> > > > domains.
> > > >
> > > > This doesn't actually matter. If you mix them together then kvm
> > > > will turn on wbinvd anyhow, so we don't need to use the DMA_PTE_SNP
> > > > anywhere in this VM.
> > > >
> > > > Thus if two IOMMUs are joined together into a single /dev/ioasid
> > > > then we can just make them both pretend to be
> > > > !IOMMU_CAP_CACHE_COHERENCY and both not set IOMMU_CACHE.
> > >
> > > Yes and no. Yes, if any domain is !IOMMU_CAP_CACHE_COHERENCY then we
> > > need to emulate wbinvd, but no we'll use IOMMU_CACHE any time it's
> > > available based on the per domain support available. That gives us the
> > > most consistent behavior, ie. we don't have VMs emulating wbinvd
> > > because they used to have a device attached where the domain required
> > > it and we can't atomically remap with new flags to perform the same as
> > > a VM that never had that device attached in the first place.
> >
> > I think we are saying the same thing..
>
> Hrm? I think I'm saying the opposite of your "both not set
> IOMMU_CACHE". IOMMU_CACHE is the mapping flag that enables
> DMA_PTE_SNP. Maybe you're using IOMMU_CACHE as the state reported to
> KVM?

I'm saying if we enable wbinvd in the guest then no IOASIDs used by
that guest need to set DMA_PTE_SNP. If we disable wbinvd in the guest
then all IOASIDs must enforce DMA_PTE_SNP (or we otherwise guarantee
no-snoop is not possible).

This is not what VFIO does today, but it is a reasonable choice.

Based on that observation we can say that even once the user wants to
use an IOMMU that does not support DMA_PTE_SNP in the guest, we can
still share the IO page table with IOMMUs that do support DMA_PTE_SNP:
wbinvd gets enabled for that guest, so nothing has to set DMA_PTE_SNP.
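
Roughly, the rule I have in mind could be sketched like this. None of
this is real kernel code: struct iommu_model, struct guest_policy and
pick_policy() are made-up names for illustration, and the sketch ignores
the "otherwise guarantee no-snoop is not possible" escape hatch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one IOMMU joined to a guest (not a kernel type). */
struct iommu_model {
	bool can_enforce_snoop;	/* hardware is able to set DMA_PTE_SNP */
};

/* Hypothetical per-guest decision, shared by all IOASIDs of that guest. */
struct guest_policy {
	bool emulate_wbinvd;	/* KVM enables wbinvd for the guest */
	bool set_dma_pte_snp;	/* every IOASID enforces DMA_PTE_SNP */
};

/*
 * If every IOMMU can enforce snoop, disable wbinvd and enforce
 * DMA_PTE_SNP everywhere. As soon as one IOMMU cannot, enable wbinvd
 * and let no IOASID set DMA_PTE_SNP, so a single IO page table can be
 * shared between both kinds of IOMMU.
 */
static struct guest_policy pick_policy(const struct iommu_model *iommus,
					size_t n)
{
	struct guest_policy p = {
		.emulate_wbinvd = false,
		.set_dma_pte_snp = true,
	};

	assert(n == 0 || iommus != NULL);

	for (size_t i = 0; i < n; i++) {
		if (!iommus[i].can_enforce_snoop) {
			p.emulate_wbinvd = true;
			p.set_dma_pte_snp = false;
			break;
		}
	}
	return p;
}
```

The point is that the decision is made once per guest, not per domain,
which is where it differs from what VFIO does today.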

> > It doesn't solve the problem to connect kvm to AP and kvmgt though
>
> It does not, we'll probably need a vfio ioctl to gratuitously announce
> the KVM fd to each device. I think some devices might currently fail
> their open callback if that linkage isn't already available though, so
> it's not clear when that should happen, ie. it can't currently be a
> VFIO_DEVICE ioctl as getting the device fd requires an open, but this
> proposal requires some availability of the vfio device fd without any
> setup, so presumably that won't yet call the driver open callback.
> Maybe that's part of the attach phase now... I'm not sure, it's not
> clear when the vfio device uAPI starts being available in the process
> of setting up the ioasid. Thanks,

At a certain point we maybe just have to stick to backward compat, I
think. Though it is useful to think about green field alternatives to
try to guide the backward compat design.

Jason