RE: [PATCH v7 1/3] iommufd: Add data structure for Intel VT-d stage-1 cache invalidation

From: Tian, Kevin
Date: Thu Dec 14 2023 - 22:04:59 EST


> From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> Sent: Friday, December 15, 2023 10:28 AM
>
> On Fri, Dec 15, 2023 at 01:50:07AM +0000, Tian, Kevin wrote:
> > > From: Liu, Yi L <yi.l.liu@xxxxxxxxx>
> > > Sent: Thursday, December 14, 2023 7:27 PM
> > >
> > > On 2023/11/17 21:18, Yi Liu wrote:
> > > > This adds the data structure for flushing iotlb for the nested domain
> > >
> > > +struct iommu_hwpt_vtd_s1_invalidate {
> > > + __aligned_u64 addr;
> > > + __aligned_u64 npages;
> > > + __u32 flags;
> > > + __u32 __reserved;
> > > + __u32 error;
> > > + __u32 dev_id;
> > > +};
> > >
> > > dev_id is used to report the failed device, userspace should be able to
> > > map it to a vRID, and inject it to VM as part of ITE/ICE error.
> > >
> > > However, I got a problem when trying to get dev_id in the cache invalidation
> > > path, since this is filled in by the intel iommu driver. There seems to be no
> > > good way to do it. I have the alternatives below to move forward; please have
> > > a look.
> > >
> > > - Reuse Nicolin's vRID->pRID mapping. If the vRID->pRID mapping is
> > > maintained, then intel iommu can report a vRID back to user. But intel
> > > iommu driver does not have viommu context, no place to hold the
> > > vRID->pRID mapping. TBH, it may require other reasons to introduce it other
> > > than the error reporting need. Anyhow, this requires more thinking and also
> > > has dependency even if it is doable in intel side.
> >
> > this sounds like a cleaner way to inject knowledge which iommu driver
> > requires to find out the user tag. but yes it's a bit weird to introduce
> > viommu awareness in intel iommu driver when there is no such thing
> > in real hardware.
>
> I think a viommu is defined more like a software object representing
> the virtual IOMMU in a VM. Since VT-d has a vIOMMU in a nesting case,
> there could be an object for it too?

for VT-d it's not necessary to maintain such vIOMMU awareness in
the kernel (before this error reporting case) given its interfaces are
simply around hwpt's. there is no vIOMMU-scope operation provided
by intel-iommu driver so far.

>
> > and for this error reporting case what we actually require is the
> > reverse map i.e. pRID->vRID. Not sure whether we can leverage the
> > same RID mapping uAPI as for ARM/AMD but ignore viommu_id
> > and then store vRID under device_domain_info. a bit tricky on
> > life cycle management and also incompatible with SIOV...
>
> One thing that I am not very clear here: since both vRID and dev_id
> are given by the VMM, shouldn't it already know the mapping if the
> point is to translate (pRID->)dev_id->vRID?
>

it's true for current Qemu.

but there is a plan to support Qemu accepting an fd passed by Libvirt.
In that case Qemu doesn't even see the sysfs path, hence is not
aware of pRID. otherwise yes we could leave the translation to
VMM instead.
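
for illustration only, a minimal sketch of what the VMM-side step could
look like if the translation is left to the VMM: a small table keyed by
the dev_id reported in the invalidation error, returning the vRID to use
for the ITE/ICE injection. all names here (vmm_rid_map, vmm_rid_map_add,
vmm_dev_id_to_vrid) are hypothetical and are not part of iommufd or Qemu.
it covers only the dev_id->vRID lookup and assumes the kernel can report
a dev_id in the first place; it does not involve pRID at all.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VMM_RID_MAP_MAX 64	/* arbitrary capacity for this sketch */

/* hypothetical VMM-side entry: iommufd dev_id -> guest-visible vRID */
struct vmm_rid_entry {
	uint32_t dev_id;
	uint16_t vrid;
};

struct vmm_rid_map {
	struct vmm_rid_entry entries[VMM_RID_MAP_MAX];
	size_t count;
};

/*
 * Record (or update) the pair, e.g. when the VMM binds the device and
 * assigns its vRID. Returns false if the table is full.
 */
static bool vmm_rid_map_add(struct vmm_rid_map *map, uint32_t dev_id,
			    uint16_t vrid)
{
	size_t i;

	for (i = 0; i < map->count; i++) {
		if (map->entries[i].dev_id == dev_id) {
			map->entries[i].vrid = vrid;
			return true;
		}
	}
	if (map->count == VMM_RID_MAP_MAX)
		return false;

	map->entries[map->count].dev_id = dev_id;
	map->entries[map->count].vrid = vrid;
	map->count++;
	return true;
}

/*
 * Translate the dev_id reported by a failed invalidation into a vRID.
 * Returns false (and leaves *vrid untouched) for an unknown dev_id.
 */
static bool vmm_dev_id_to_vrid(const struct vmm_rid_map *map,
			       uint32_t dev_id, uint16_t *vrid)
{
	size_t i;

	for (i = 0; i < map->count; i++) {
		if (map->entries[i].dev_id == dev_id) {
			*vrid = map->entries[i].vrid;
			return true;
		}
	}
	return false;
}
```

a linear scan is used only to keep the sketch short; a real VMM would
likely hang the vRID off its existing per-device state instead.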