Re: [PATCH v6 0/6] iommufd: Add nesting infrastructure (part 2/2)

From: Jason Gunthorpe
Date: Mon Dec 11 2023 - 08:20:53 EST


On Mon, Dec 11, 2023 at 08:35:09PM +0800, Yi Liu wrote:

> > What iommufd object should receive the IOTLB invalidation command list:
> > Intel: The Nesting domain. The command list has to be broken up per
> > (vDomain-ID,PASID) and that batch delivered to the single
> > nesting domain. Kernel ignores vDomain-ID/PASID and just
> > invalidates whatever the nesting domain is actually attached to
>
> this is what we are doing in current series[1]. is it?

I think so

> > So.. In short.. Invalidation is a PITA. The idea is the same but
> > annoying little details interfere with actually having a compltely
> > common API here. IMHO the uAPI in this series is fine. It will support
> > Intel invalidation and non-ATC invalidation on AMD/ARM. It should be
> > setup to allow that the target domain object can be any HWPT.
>
> This HWPT is still nested domain. Is it? But it can represent a guest I/O
> page table (VT-d), guest CD table (ARM), guest CR3 Table (AMD, it seems to
> be a set of guest CR3 table pointers). May ARM and AMD guys keep me honest
> here.

I was thinking ARM would not want to use a nested domain because
really the invalidation is global to the entire nesting parent.

But, there is an issue with that - the nesting parent could be
attached to multiple iommu instances but we only want to invalidate a
single instance.

However, specifying the nesting domain instead, and restricting the
nesting domain to be single-instance would provide the kernel enough
information without providing weird new APIs. So maybe it is best to
just make everything use the nesting domain even if it is somewhat
semantically weird on arm.

> The Intel guest I/O page table case may be the simplest as userspace only
> needs to provide the HWPT ID and the affected ranges for invalidating. As
> mentioned above, kernel will find out the attached device/pasid and
> invalidating cache with the device/pasid. For ARM and AMD case, extra
> information is needed. Am I getting you correct?

Yes

> > Thus next steps:
> > - Respin this and lets focus on Intel only (this will be tough for
> > the holidays, but if it is available I will try)
>
> I've respinned the iommufd cache invalidation part with the change to
> report error_code/error_data per invalidation entry. yet still busy on
> making Intel VTd part to report the error_code. Besides, I didn't see
> other respin needed for Intel VT-d invalidation. If I missed thing, please
> do let me know.:)

Lets see

Jason