Re: [PATCH v6 0/6] iommufd: Add nesting infrastructure (part 2/2)

From: Jason Gunthorpe
Date: Mon Dec 11 2023 - 16:57:47 EST


On Mon, Dec 11, 2023 at 01:27:49PM -0800, Nicolin Chen wrote:
> On Fri, Dec 08, 2023 at 09:47:26PM -0400, Jason Gunthorpe wrote:
> > What is in a Nested domain:
> > ARM: A CD table pointer
> > Nesting domains are created for every unique CD table top pointer.
>
> I think we basically implemented in a way of syncing STE, i,e,
> vSTE.Config must be "S1 Translate" besides a CD table pointer,
> and a nested domain is freed when vSTE.Config=BYPASS even if a
> CD table pointer is present, right?

Yes, but you can also de-duplicate the nested domains based on the CD
table pointer. It is not as critical for ARM as others, but may
still be worth doing.

> > To make this work the iommu needs to be programmed with:
> > AMD: A vDomain-ID -> pDomain-ID table
> > A vRID -> pRID table
> > This is all bound to some "virtual function"
> > ARM: A vRID -> pRID table
> > The vCMDQ is bound to a VM_ID, so to the Nesting Parent
>
> VCMDQ also has something called "virtual interface" that holds
> a VMID and a list of CMDQ queues, which might be a bit similar
> to AMD's "virtual function".

Yeah, there must be some kind of logical grouping of HW objects to
build that kind of stuff.

> > The vRID->pRID table should be some mostly common
> > IOMMUFD_DEV_ASSIGN_VIRTUAL_ID. AMD will need to pass in the virtual
> > function ID and ARM will need to pass in the Nesting Parent ID.
>
> It sounds like our previous IOMMUFD_SET/UNSET_IDEV_DATA. I'm
> wondering if we need to make it exclusive to the ID assigning?
> Maybe set_idev_data could be reused for other potential cases?

No, it should be an API only for the ID

> If we do implement an IOMMUFD_DEV_ASSIGN_VIRTUAL_ID, do we need
> an IOMMUFD_DEV_RESIGN_VIRTUAL_ID? (or better word than resign).

I don't think so.. The vRID is basically fixed, if it needs to be
changed then the device can be destroyed (or assign can just change it)

> Could the structure just look like this?
> struct iommu_dev_assign_virtual_id {
> __u32 size;
> __u32 dev_id;
> __u32 id_type;
> __u32 id;
> };

It needs to take in the viommu_id also, and I'd make the id 64 bits
just for good luck.

> > In many ways the nesting parent/virtual function are very similar
> > things. Perhaps ARM should also create a virtual function object which
> > is just welded to the nesting parent for API consistency.
>
> A virtual function that holds an S2 domain/iopt + a VMID? If
> this is for VCMDQ, the VMCDQ extension driver has that kinda
> object holding an S2 domain: I implemented as the extension
> function at the end of arm_smmu_finalise_s2() previously.

Not so much hold a S2, but that the VMID would be forced to be shared
amung them somehow.

> > IOMMUFD_DEV_INVALIDATE should be introduced with the same design as
> > HWPT invalidate. This would be used for AMD/ARM's ATC invalidation
> > (and just force the stream ID, userspace must direct the vRID to the
> > correct dev_id).
>
> SMMU's CD invalidations could fall into this category too.

Yes, I forgot to look closely at the CD/GCR3 table invalidations :(
I actually can't tell how AMD invalidates any GCR3 cache, maybe
INVALIDATE_DEVTAB_ENTRY?

> > Then in yet another series we can tackle the entire "virtual function"
> > vRID/pRID translation stuff when the mmapable queue thing is
> > introduced.
>
> VCMDQ is also a mmapable queue. I feel that there could be
> more common stuff between "virtual function" and "virtual
> interface", I'll need to take a look at AMD's stuff though.

I'm not thinking of two things right now at least..

> I previously drafted something to test it out with iommufd.
> Basically it needs the pairing of vRID/pRID in attach_dev()
> and another ioctl to mmap/config user queue(s):
> +struct iommu_hwpt_cache_config_tegra241_vcmdq {
> + __u32 vcmdq_id; // queue id
> + __u32 vcmdq_log2size; // queue size
> + __aligned_u64 vcmdq_base; // queue guest PA
> +};

vRID/pRID pairing should come from IOMMUFD_DEV_ASSIGN_VIRTUAL_ID. When
a HWPT is allocated it would be connected to the viommu_id and then it
would all be bundled together in the HW somehow

>From there you can ask the viommu_id to setup a queue.

Jason