RE: [PATCH v7 1/3] iommufd: Add data structure for Intel VT-d stage-1 cache invalidation

From: Tian, Kevin
Date: Tue Nov 21 2023 - 23:58:44 EST


> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Tuesday, November 21, 2023 8:17 PM
>
> On Tue, Nov 21, 2023 at 02:54:15AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > Sent: Tuesday, November 21, 2023 7:05 AM
> > >
> > > On Mon, Nov 20, 2023 at 08:26:31AM +0000, Tian, Kevin wrote:
> > > > > From: Liu, Yi L <yi.l.liu@xxxxxxxxx>
> > > > > Sent: Friday, November 17, 2023 9:18 PM
> > > > >
> > > > > This adds the data structure for flushing iotlb for the nested domain
> > > > > allocated with IOMMU_HWPT_DATA_VTD_S1 type.
> > > > >
> > > > > This only supports invalidating IOTLB, but no for device-TLB as
> > > > > device-TLB invalidation will be covered automatically in the IOTLB
> > > > > invalidation if the underlying IOMMU driver has enabled ATS for the
> > > > > affected device.
> > > >
> > > > "no for device-TLB" is misleading. Here just say that cache invalidation
> > > > request applies to both IOTLB and device TLB (if ATS is enabled ...)
> > >
> > > I think we should forward the ATS invalidation from the guest too?
> > > That is what ARM and AMD will have to do, can we keep them all
> > > consistent?
> > >
> > > I understand Intel keeps track of enough stuff to know what the RIDs
> > > are, but is it necessary to make it different?
> >
> > probably ask the other way. Now intel-iommu driver always flushes
> > iotlb and device tlb together then is it necessary to separate them
> > in uAPI for no good (except doubled syscalls)? :)
>
> I wish I knew more about Intel CC design to be able to answer that :|
>
> Doesn't the VM issue the ATC flush command regardless? How does it
> know it has a working ATC but does not need to flush it?
>
> > anyway this is driver specific contract. I don't see a need to keep
> > it consistent for all.
>
> Given that ARM and AMD need this and would have serious bugs if it
> didn't work this way I'm mildly concerned that Intel will be missing
> something here..
>
> To my mind it seems like this is just a hold over from the prior
> design.
>

As Yi/Baolu discussed, there is an issue in the intel-iommu driver: it
incorrectly skips devtlb invalidation in the guest on the assumption
that the host combines iotlb and devtlb invalidation. That is
incorrect and should be fixed.

But what I was talking about earlier is the uAPI between the
viommu and the iommu driver. I don't see a need for separate
invalidation cmds for each, as I'm not sure what the user can
expect in the window when iotlb and devtlb are out of sync.

Then we can just define that hwpt 'cache' invalidation in vtd always
refers to both iotlb and devtlb. The viommu then only needs to call
the invalidation uapi once, when emulating the virtual iotlb
invalidation descriptor, and can emulate the following devtlb
invalidation descriptor as a nop.
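
To illustrate, here is a rough sketch of what the viommu side could look
like under that definition. All names below (struct vdesc, the VDESC_*
types, host_cache_invalidate(), viommu_handle_desc()) are hypothetical
and only for illustration; they are not the actual iommufd uAPI or any
existing viommu code. host_cache_invalidate() is a stand-in for the
single hwpt cache invalidation call, assumed to flush both iotlb and
devtlb on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical virtual invalidation descriptor types (illustration only). */
enum vdesc_type {
	VDESC_IOTLB_INV,	/* guest iotlb invalidation descriptor */
	VDESC_DEVTLB_INV,	/* guest devtlb invalidation descriptor */
};

struct vdesc {
	enum vdesc_type type;
	unsigned long long addr;
	unsigned long long npages;
};

/* Bookkeeping so the sketch can be exercised without a real host. */
static int host_inv_calls;
static unsigned long long last_addr, last_npages;

/*
 * Stand-in for the hwpt cache invalidation uAPI call. Assumption: the
 * host driver flushes iotlb and devtlb (if ATS is enabled) together.
 */
static int host_cache_invalidate(unsigned long long addr,
				 unsigned long long npages)
{
	host_inv_calls++;
	last_addr = addr;
	last_npages = npages;
	return 0;
}

/* Emulate one guest invalidation descriptor. */
static int viommu_handle_desc(const struct vdesc *d)
{
	assert(d != NULL);

	switch (d->type) {
	case VDESC_IOTLB_INV:
		/* One uAPI call covers both iotlb and devtlb. */
		return host_cache_invalidate(d->addr, d->npages);
	case VDESC_DEVTLB_INV:
		/* Already covered by the preceding iotlb invalidation. */
		return 0;
	}
	return -1;	/* unknown descriptor type */
}
```

With this, a guest that issues an iotlb invalidation followed by a
devtlb invalidation for the same range ends up with exactly one call
into the host.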