Re: [PATCH 1/1] iommufd/selftest: Use right iommu_ops for mock device

From: Robin Murphy
Date: Thu Jan 11 2024 - 10:51:21 EST


On 11/01/2024 2:48 pm, Jason Gunthorpe wrote:
On Thu, Jan 11, 2024 at 03:32:13PM +0800, Lu Baolu wrote:
In the iommu probe device path, __iommu_probe_device() gets the iommu_ops
for the device from dev->iommu->fwspec if this field has been initialized
before probing. Otherwise, it will lookup the global iommu device list
and use the iommu_ops of the first iommu device which has no
dev->iommu->fwspec. This causes the wrong iommu_ops to be used for the mock
device on x86 platforms where dev->iommu->fwspec is not used.

Preallocate the fwspec for the mock device so that the right iommu ops can
be used.

I really don't like this.

The lifecycle model for fwspec is already a bit confusing. Introducing
a new case where a driver pre-allocates the fwspec is making it worse,
not better.

eg iommu_init_device() error unwind will free this allocated fwspec
leaving the device broken. We don't have the concept of a fwspec that
is owned by the device, it is really owned by the probing code.

As I've tried to explain before, this is in fact the correct use of fwspec as originally designed, i.e. being set up by *bus code* before device_add() (remember this is not the "IOMMU driver" part of selftest.c).

Indeed for perfect symmetry the bus code would free the fwspec after the corresponding device_del() returns, but there's no harm in that being factored into iommu_release_device() since the notifier call occurs sufficiently late in device_del() itself as to make no practical difference.

I'm working to get things back to that model (wherein the dev_iommu and fwspec lifecycles become trivial), just with the slight tweak that these days it's going to make more sense to have the initialisation factored into device_add() itself (via iommu_probe_device()), rather than beforehand.

Thanks,
Robin.

The fundamental issue is we now have a special kind of driver:

fwspec = dev_iommu_fwspec_get(dev);
if (fwspec && fwspec->ops)
ops = fwspec->ops;
else
ops = iommu_ops_from_fwnode(NULL);
^^^^^^^^

Which represents a "global" non-fwspec using driver that will only
bind to devices that didn't parse into a fwspec.

The code above supports only one of these drivers at time, but allows
more than one to be registered - it is inconsistent.

I think the right/easy answer is to iterate over all the "global"
drivers and call their probe instead of just the first one.

Especially since my approach over here migrates the whole thing to work
by iterating:

https://lore.kernel.org/all/0-v2-f82a05539a64+5109-iommu_fwspec_p2_jgg@xxxxxxxxxx/

And this patch:

https://lore.kernel.org/all/28-v2-f82a05539a64+5109-iommu_fwspec_p2_jgg@xxxxxxxxxx/

Is how I made the iterating logic, it could be pulled out and tidied a
bit.

Jason