RE: [PATCH v1 2/8] iommu: Introduce a new iommu_group_replace_domain() API

From: Tian, Kevin
Date: Mon Feb 06 2023 - 01:58:09 EST


> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Friday, February 3, 2023 11:03 PM
>
> On Fri, Feb 03, 2023 at 08:26:44AM +0000, Tian, Kevin wrote:
> > > From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> > > Sent: Thursday, February 2, 2023 3:05 PM
> > >
> > > All drivers are already required to support changing between active
> > > UNMANAGED domains when using their attach_dev ops.
> >
> > All drivers which don't have *broken* UNMANAGED domain?
>
> No, all drivers.. It has always been used by VFIO.

existing iommu_attach_group() doesn't support changing between
two UNMANAGED domains. only from default->unmanaged or
blocking->unmanaged.

Above statement is true only if this series is based on your unmerged
work removing DMA domain types.

>
> > > +{
> > > + int ret;
> > > +
> > > + if (!new_domain)
> > > + return -EINVAL;
> > > +
> > > + mutex_lock(&group->mutex);
> > > + ret = __iommu_group_set_domain(group, new_domain);
> > > + if (ret) {
> > > + if (__iommu_group_set_domain(group, group->domain))
> > > + __iommu_group_set_core_domain(group);
> > > + }
> >
> > Can you elaborate the error handling here? Ideally if
> > __iommu_group_set_domain() fails then group->domain shouldn't
> > be changed.
>
> That isn't what it implements though. The internal helper leaves
> things in a mess, it is for the caller to fix it, and it depends on
> the caller what that means.

I didn't see any warning of the mess and the caller's responsibility
in __iommu_group_set_domain(). Can it be documented clearly
so if someone wants to add a new caller on it he can clearly know
what to do?

and why doesn't iommu_attach_group() need to do anything
when an attach attempt fails? In the end it calls the same
iommu_group_do_attach_device() as __iommu_group_set_domain()
does.

btw looking at the code __iommu_group_set_domain():

* Note that this is called in error unwind paths, attaching to a
* domain that has already been attached cannot fail.
*/
ret = __iommu_group_for_each_dev(group, new_domain,
iommu_group_do_attach_device);

with that we don't need fall back to core domain in above error
unwinding per this comment.

>
> In this case the API cannot retain a hidden reference to the new
> domain, so it must be purged, one way or another.
>

Could you elaborate where the hidden reference is retained?