Re: [PATCH v2 1/3] iommufd: Add devices_users to track the hw_pagetable usage by device

From: Nicolin Chen
Date: Thu Feb 02 2023 - 02:28:20 EST


On Wed, Feb 01, 2023 at 01:18:08PM -0800, Nicolin Chen wrote:
> On Wed, Feb 01, 2023 at 04:00:40PM -0400, Jason Gunthorpe wrote:
> > On Wed, Feb 01, 2023 at 11:25:10AM -0800, Nicolin Chen wrote:
> >
> > > The "finalise" is one of the very first lines of the attach_dev()
> > > callback function in SMMU driver, though it might still undesirably
> > > fail the replace().
> >
> > It won't get that far.
> >
> > Remember how this all works - only autodomains have the special path
> > that allocates a domain, attaches the empty domain, and then populates
> > it with the ioas. We made this special path specifically to accomodate
> > the current ARM drivers, otherwise they wouldn't work at all.
>
> Yes.
>
> > replace can't do this - replace must always start out with a
> > pre-existing hwpt that was allocated with a dedicated hwpt allocation
> > ioctl.
> >
> > Wwhen the hwpt was allocated it must be linked to the IOAS at that
> > time, because we definately don't do defered IOAS linkage.
> >
> > So on ARM when you create an unfinalizes iommu_domain it cannot be
> > added to the IOAS (because it has an empty aperture) and creation will
> > fail, or worse, it will get added to an empty IOAS and make the IOAS
> > permanently unusable.
>
> IIUIC, user space might add some IOVA mappings to the hwpt/iopt,
> before doing a replace(). If we do a deferred IOAS linkage to
> this hwpt/iopt, it will cause a problem because we are adding
> the reserved region for the MSI window later than IOMMU_IOAS_MAP
> calls. Thus, "we definately don't do defered IOAS linkage".
>
> With this justification, I think I should include my patch of
> moving iopt_table_add/remove_domain(), in the replace series.

I just posted the replace series. But I found that the base
line (without the nesting series) allocates iommu_domains
always with the ->domain_alloc() op. So, we cannot actually
move the iopt_table_add_domain() in the replace series, as I
intended to.

Yet, a great news is that our nesting series replaces the
domain_alloc() op entirely with ->domain_alloc_user() for all
the iommu_domain allocations, including for auto_domains. So,
we can completely move iopt_table_add_domain() to the hwpt
allocation function. And we don't really need a big change
in the SMMU driver nor Robin's patch that passes in dev ptr
to domain_alloc() op. And even this device_users refcount in
this patch is no longer needed. It also simplifies the shared
device locking situation, if I am not missing anything.

So, in short, we'll have to wait for ->domain_alloc_user()
patch (in the nesting series) to unblock the problem that we
discussed above regarding the iopt_table_add_domain().

Thanks
Nic