Re: [RFC] /dev/ioasid uAPI proposal

From: David Gibson
Date: Thu Jun 17 2021 - 03:22:20 EST


On Thu, Jun 03, 2021 at 08:12:27AM +0000, Tian, Kevin wrote:
> > From: David Gibson <david@xxxxxxxxxxxxxxxxxxxxx>
> > Sent: Wednesday, June 2, 2021 2:15 PM
> >
> [...]
>
> > >
> > > /*
> > > * Get information about an I/O address space
> > > *
> > > * Supported capabilities:
> > > * - VFIO type1 map/unmap;
> > > * - pgtable/pasid_table binding
> > > * - hardware nesting vs. software nesting;
> > > * - ...
> > > *
> > > * Related attributes:
> > > * - supported page sizes, reserved IOVA ranges (DMA mapping);
> >
> > Can I request we represent this in terms of permitted IOVA ranges,
> > rather than reserved IOVA ranges. This works better with the "window"
> > model I have in mind for unifying the restrictions of the POWER IOMMU
> > with Type1 like mapping.
>
> Can you elaborate how permitted ranges work better here?

Pretty much just that MAP operations would fail if they don't lie
entirely within a permitted range. So, for example, if your IOMMU only
implements, say, 45 bits of IOVA, then you'd have 0..0x1fffffffffff as
your only permitted range. If, like the POWER paravirtual IOMMU (in
its default configuration), you have a small (1G) 32-bit range and a
large (45-bit) 64-bit range at a high address, you'd have, say:
	0x00000000..0x3fffffff (32-bit range)
and
	0x800000000000000..0x8001fffffffffff (64-bit range)
as your permitted ranges.
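
To make the rule concrete, here is a rough sketch of the check I have
in mind. The struct and function names are made up for illustration
and are not part of any proposed uAPI; I'm also assuming ranges are
stored with inclusive ends, as written above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one permitted IOVA window, both ends inclusive. */
struct iova_range {
	uint64_t start;
	uint64_t last;
};

/*
 * Return true if [iova, iova + len - 1] lies entirely within a single
 * permitted range.  A mapping that straddles two ranges is rejected.
 */
static bool iova_map_permitted(const struct iova_range *ranges, size_t nr,
			       uint64_t iova, uint64_t len)
{
	uint64_t last;
	size_t i;

	assert(ranges != NULL || nr == 0);

	if (len == 0)
		return false;
	last = iova + (len - 1);
	if (last < iova)	/* wrapped past the top of the 64-bit space */
		return false;

	for (i = 0; i < nr; i++) {
		if (iova >= ranges[i].start && last <= ranges[i].last)
			return true;
	}
	return false;
}

/* The two windows from the POWER example above (inclusive ends). */
static const struct iova_range power_default[] = {
	{ 0x0000000000000000ULL, 0x000000003fffffffULL }, /* 1G 32-bit window */
	{ 0x0800000000000000ULL, 0x08001fffffffffffULL }, /* 45-bit 64-bit window */
};
```

With that table, a 4k map at 0x3ffff000 is accepted, while an 8k map
at the same address is refused because it runs past the end of the
32-bit window.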

If your IOMMU supports truly full 64-bit addressing, but has a
reserved range (for MSIs or whatever) at 0xaaaa0000..0xbbbaffff, then
you'd have permitted ranges of 0..0xaaa9ffff and
0xbbbb0000..0xffffffffffffffff.
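
Going from a reserved list to a permitted list is just taking the
complement over the 64-bit space. A minimal sketch, again with
hypothetical names, assuming the reserved list is sorted,
non-overlapping and uses inclusive ends:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: an IOVA range with both ends inclusive. */
struct iova_range {
	uint64_t start;
	uint64_t last;
};

/*
 * Fill 'out' with the gaps left by 'resv' within 0..UINT64_MAX.
 * 'resv' must be sorted by start and non-overlapping; 'out' needs
 * room for nr_resv + 1 entries.  Returns the number of ranges written.
 */
static size_t iova_permitted_from_reserved(const struct iova_range *resv,
					   size_t nr_resv,
					   struct iova_range *out)
{
	uint64_t next = 0;	/* lowest address not yet accounted for */
	size_t i, n = 0;

	assert(out != NULL);

	for (i = 0; i < nr_resv; i++) {
		assert(resv[i].start <= resv[i].last);
		assert(resv[i].start >= next);	/* sorted, no overlap */

		if (resv[i].start > next) {
			out[n].start = next;
			out[n].last = resv[i].start - 1;
			n++;
		}
		if (resv[i].last == UINT64_MAX)
			return n;	/* nothing left above this hole */
		next = resv[i].last + 1;
	}

	/* Everything above the last reserved range is permitted. */
	out[n].start = next;
	out[n].last = UINT64_MAX;
	return n + 1;
}
```

Feeding it a single hole like the one above, written with inclusive
ends as 0xaaaa0000..0xbbbaffff, gives back the two permitted ranges
listed; an empty reserved list gives back the whole 64-bit space as
one range.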

[snip]
> > For debugging and certain hypervisor edge cases it might be useful to
> > have a call to allow userspace to look up a specific IOVA in a guest
> > managed pgtable.
>
> Since all the mapping metadata is from userspace, why would one
> rely on the kernel to provide such service? Or are you simply asking
> for some debugfs node to dump the I/O page table for a given
> IOASID?

I'm thinking of this as a debugging aid, so you can make sure that
the kernel is interpreting that metadata in the same way that your
userspace expects it to.


--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
