Re: [PATCH v2 00/11] iommufd: Add nesting infrastructure

From: Jason Gunthorpe
Date: Tue Jun 06 2023 - 10:18:09 EST


On Wed, May 24, 2023 at 03:48:43AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > Sent: Friday, May 19, 2023 7:50 PM
> >
> > On Fri, May 19, 2023 at 09:56:04AM +0000, Tian, Kevin wrote:
> > > > From: Liu, Yi L <yi.l.liu@xxxxxxxxx>
> > > > Sent: Thursday, May 11, 2023 10:39 PM
> > > >
> > > > Lu Baolu (2):
> > > > iommu: Add new iommu op to create domains owned by userspace
> > > > iommu: Add nested domain support
> > > >
> > > > Nicolin Chen (5):
> > > > iommufd/hw_pagetable: Do not populate user-managed hw_pagetables
> > > > iommufd/selftest: Add domain_alloc_user() support in iommu mock
> > > > iommufd/selftest: Add coverage for IOMMU_HWPT_ALLOC with user data
> > > > iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
> > > > iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
> > > >
> > > > Yi Liu (4):
> > > > iommufd/hw_pagetable: Use domain_alloc_user op for domain allocation
> > > > iommufd: Pass parent hwpt and user_data to
> > > > iommufd_hw_pagetable_alloc()
> > > > iommufd: IOMMU_HWPT_ALLOC allocation with user data
> > > > iommufd: Add IOMMU_HWPT_INVALIDATE
> > > >
> > >
> > > I didn't see any change in iommufd_hw_pagetable_attach() to handle
> > > stage-1 hwpt differently.
> > >
> > > In concept whatever reserved regions existing on a device should be
> > > directly reflected on the hwpt which the device is attached to.
> > >
> > > So with nesting presumably the reserved regions of the device have
> > > been reported to the userspace and it's user's responsibility to avoid
> > > allocating IOVA from those reserved regions in stage-1 hwpt.
> >
> > Presumably
> >
> > > It's not necessary to add reserved regions to the IOAS of the parent
> > > hwpt since the device doesn't access that address space after it's
> > > attached to stage-1. The parent is used only for address translation
> > > in the iommu side.
> >
> > But if we don't put them in the IOAS of the parent there is no way for
> > userspace to learn what they are to forward to the VM ?
>
> emmm I wonder whether that is the right interface to report
> per-device reserved regions.

The iommu driver needs to report different reserved regions for the S1
and S2 iommu_domains, and the IOAS should only get the reserved
regions for the S2.

Currently the API has no way to report per-domain reserved regions and
that is possibly OK for now. The S2 really doesn't have reserved
regions beyond the domain aperture.

So an ioctl to directly query the reserved regions for a dev_id makes
sense.

> > Since we expect the parent IOAS to be usable in an identity mode I
> > think they should be added, at least I can't see a reason not to add
> > them.
>
> this is a good point.

But it mixes things.

The S2 doesn't have reserved range restrictions. We always have some
model of an S1, even for identity mode, and that would carry the
reserved ranges.

> With that it makes more sense to make it a vendor specific choice.

It isn't vendor specific. The ranges come from the domain that is
attached to the IOAS, and we simply don't import ranges for an S2
domain.
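
To make the rule concrete, here is a rough standalone sketch. It is a
userspace model only, not kernel or iommufd code, and every name in it
(enum stage, struct range, struct domain, domain_reserved_ranges) is
made up for illustration. It assumes the reserved ranges are a
per-domain property, that an S1 carries the device's reserved regions,
and that an S2 reports none:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model for discussion, not the iommufd API. */
enum stage { STAGE_S1, STAGE_S2 };

struct range {
	uint64_t start;
	uint64_t last;	/* inclusive */
};

struct domain {
	enum stage stage;
	const struct range *resv;	/* reserved regions of the attached device */
	size_t nr_resv;
};

/*
 * Copy the ranges that should be treated as reserved for domain 'd'
 * into 'out' (at most 'max' entries) and return how many were written.
 *
 * Assumption in this sketch: an S2 domain reports nothing here, since
 * only its aperture constrains it. An S1 domain carries the device's
 * reserved regions.
 */
static size_t domain_reserved_ranges(const struct domain *d,
				     struct range *out, size_t max)
{
	size_t i, n = 0;

	assert(d != NULL && out != NULL);

	if (d->stage == STAGE_S2)
		return 0;

	for (i = 0; i < d->nr_resv && n < max; i++)
		out[n++] = d->resv[i];

	return n;
}
```

The point of the sketch is that the decision hangs off the domain type
alone, so nothing about it needs to be a per-vendor choice.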

Jason