Re: Question about reserved_regions w/ Intel IOMMU

From: Jason Gunthorpe
Date: Tue Jun 13 2023 - 11:54:30 EST


On Thu, Jun 08, 2023 at 04:28:24PM +0100, Robin Murphy wrote:

> > The iova_reserve_pci_windows() you've seen is for kernel DMA interfaces
> > which is not related to peer-to-peer accesses.
>
> Right, in general the IOMMU driver cannot be held responsible for whatever
> might happen upstream of the IOMMU input.

The driver yes, but..

> The DMA layer carves PCI windows out of its IOVA space
> unconditionally because we know that they *might* be problematic,
> and we don't have any specific constraints on our IOVA layout so
> it's no big deal to just sacrifice some space for simplicity.

This is a problem for everything using UNMANAGED domains. If the iommu
API user picks an IOVA it should be able to expect it to work. If the
intereconnect fails to allow it to work then this has to be discovered
otherwise UNAMANGED domains are not usable at all.

Eg vfio and iommufd are also in trouble on these configurations.

We shouldn't expect every iommu user to fix this entirely on their
own.

> We don't want to have to go digging any further into bus-specific
> code to reason about whether the right ACS capabilities are present
> and enabled everywhere to prevent direct P2P or not. Other use-cases
> may have different requirements, though, so it's up to them what
> they want to do.

I agree the dma-iommu stuff doesn't have to be as precise as other
places might want (but also wouldn't be harmed by being more precise)

But I can't think of any place that can just ignore this and still be
correct..

So, I think it make sense that the iommu driver not be involved, but
IMHO the core code should have APIs to report IOVA that doesn't work
and every user of UNMANAGED domains needs to check it.

IOW it should probably come out of the existing reserved regions
interface.

Jason