RE: [PATCH v3 07/17] iommufd: Add IOMMU_RESV_IOVA_RANGES

From: Liu, Yi L
Date: Mon Jul 31 2023 - 05:54:48 EST


> From: Tian, Kevin <kevin.tian@xxxxxxxxx>
> Sent: Monday, July 31, 2023 2:22 PM
>
> > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > Sent: Saturday, July 29, 2023 1:45 AM
> >
> > On Mon, Jul 24, 2023 at 04:03:56AM -0700, Yi Liu wrote:
> >
> > > +/**
> > > + * struct iommu_resv_iova_ranges - ioctl(IOMMU_RESV_IOVA_RANGES)
> > > + * @size: sizeof(struct iommu_resv_iova_ranges)
> > > + * @dev_id: device to read resv iova ranges for
> > > + * @num_iovas: Input/Output total number of resv ranges for the device
> > > + * @__reserved: Must be 0
> > > + * @resv_iovas: Pointer to the output array of struct
> > iommu_resv_iova_range
> > > + *
> > > + * Query a device for ranges of reserved IOVAs. num_iovas will be set to
> > the
> > > + * total number of iovas and the resv_iovas[] will be filled in as space
> > > + * permits.
> > > + *
> > > + * On input num_iovas is the length of the resv_iovas array. On output it is
> > > + * the total number of iovas filled in. The ioctl will return -EMSGSIZE and
> > > + * set num_iovas to the required value if num_iovas is too small. In this
> > > + * case the caller should allocate a larger output array and re-issue the
> > > + * ioctl.
> > > + *
> > > + * Under nested translation, userspace should query the reserved IOVAs
> > for a
> > > + * given device, and report it to the stage-1 I/O page table owner to
> > exclude
> > > + * the reserved IOVAs. The reserved IOVAs can also be used to figure out
> > the
> > > + * allowed IOVA ranges for the IOAS that the device is attached to. For
> > detail
> > > + * see ioctl IOMMU_IOAS_IOVA_RANGES.
> >
> > I'm not sure I like this, the other APIs here work with the *allowed*
> > IOVAs, which is the inverse of this one which works with the
> > *disallowed* IOVAs.
> >
> > It means we can't take the output of this API and feed it into
> > IOMMUFD_CMD_IOAS_ALLOW_IOVAS.. Though I suppose qemu isn't going
> > to do
> > this anyhow.
> >
> > On the other hand, it is kind of hard to intersect an allowed list of
> > multiple idevs into a single master list.
> >
> > As it is, userspace will have to aggregate the list, sort it, merge
> > adjacent overlapping reserved ranges then invert the list to get an
> > allowed list. This is not entirely simple..
> >
> > Did you already write an algorithm to do this in qemu someplace?
>
> Qemu is optional to aggregate it for S2 given IOMMU_IOAS_IOVA_RANGES
> is still being used. If the only purpose of using this new cmd is to report
> per-device reserved ranges to the guest then aggregation is not required.
>
> Arguably IOMMU_IOAS_IOVA_RANGES becomes redundant with this
> new cmd. But it's already there and as you said it's actually more
> convenient to be used if the user doesn't care about per-device
> reserved ranges...

Yes. it's not simple as userspace needs to do a lot ofwork to get the allowed
iovas if multiple devices are attached to an IOAS.

> > Anyhow, this should be split out from this series. It seems simple
> > enough to merge it now if someone can confirm what qemu needs.

Ok, so the reason is this new ioctl can be used to figure out allowd iovas.
Right?

Regards,
Yi Liu