Re: [PATCH 1/1] iommufd/selftest: Use right iommu_ops for mock device

From: Robin Murphy
Date: Tue Jan 16 2024 - 13:20:23 EST


On 11/01/2024 3:56 pm, Jason Gunthorpe wrote:
On Thu, Jan 11, 2024 at 03:50:51PM +0000, Robin Murphy wrote:
On 11/01/2024 2:48 pm, Jason Gunthorpe wrote:
On Thu, Jan 11, 2024 at 03:32:13PM +0800, Lu Baolu wrote:
In the iommu probe device path, __iommu_probe_device() gets the iommu_ops
for the device from dev->iommu->fwspec if this field has been initialized
before probing. Otherwise, it will look up the global iommu device list
and use the iommu_ops of the first iommu device which has no
dev->iommu->fwspec. This causes the wrong iommu_ops to be used for the mock
device on x86 platforms where dev->iommu->fwspec is not used.

Preallocate the fwspec for the mock device so that the right iommu ops can
be used.
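
For illustration only, the selection logic described above can be sketched as a small userspace model. This is not the kernel code: every name below (model_ops, model_iommu, model_pick_ops, and so on) is hypothetical, and "an iommu device which has no fwspec" is modelled as an instance registered with a NULL fwnode, which is an assumption about what the description means.

```c
/* Userspace model only; all names here are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stddef.h>

struct model_ops { const char *name; };

/* One registered IOMMU instance, kept in registration order. */
struct model_iommu {
    const struct model_ops *ops;
    const void *fwnode;            /* NULL: registered without firmware data */
    struct model_iommu *next;
};

struct model_fwspec { const struct model_ops *ops; };
struct model_dev    { struct model_fwspec *fwspec; };

/*
 * Pick the ops used to probe a device:
 *  - if the device already carries a fwspec with ops, use those;
 *  - otherwise fall back to the first registered instance that has no
 *    fwnode, whichever driver that happens to belong to.
 */
static const struct model_ops *
model_pick_ops(const struct model_iommu *list, const struct model_dev *dev)
{
    assert(dev != NULL);

    if (dev->fwspec && dev->fwspec->ops)
        return dev->fwspec->ops;

    for (; list; list = list->next)
        if (!list->fwnode)
            return list->ops;

    return NULL;
}
```

In this model, if a platform driver registered first and the mock driver second, a mock device without a fwspec is handed the platform driver's ops, while a mock device with a preallocated fwspec pointing at the mock ops gets the intended ones.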

I really don't like this.

The lifecycle model for fwspec is already a bit confusing. Introducing
a new case where a driver pre-allocates the fwspec is making it worse,
not better.

e.g. iommu_init_device() error unwind will free this allocated fwspec
leaving the device broken. We don't have the concept of a fwspec that
is owned by the device, it is really owned by the probing code.
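
The ownership concern can be sketched in a userspace model, assuming (as stated above) that the error unwind frees the fwspec unconditionally. The names model_dev, model_fwspec and model_init_device are hypothetical and do not correspond to kernel functions.

```c
/* Userspace model only; all names here are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdlib.h>

struct model_fwspec { int placeholder; };
struct model_dev    { struct model_fwspec *fwspec; };

/*
 * Model of a probe-time init step. On failure the unwind frees the
 * fwspec no matter who allocated it, so a caller that preallocated it
 * is left with a device that no longer carries its firmware data.
 */
static int model_init_device(struct model_dev *dev, int driver_result)
{
    assert(dev != NULL);

    if (driver_result != 0) {
        free(dev->fwspec);         /* unwind: the probing code acts as owner */
        dev->fwspec = NULL;
        return driver_result;
    }
    return 0;
}
```

In this model a caller that allocates dev->fwspec up front and then sees model_init_device() fail finds dev->fwspec set to NULL afterwards, so a later retry would no longer have the preallocated data to rely on.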

As I've tried to explain before, this is in fact the correct use of fwspec
as originally designed, i.e. being set up by *bus code* before device_add()
(remember this is not the "IOMMU driver" part of selftest.c).

I understand it was the intention, but it doesn't really match how the
code works today.

The fact that some things aren't following the pattern, and are broken and problematic in several ways as a result, does not mean that other things that *can* follow the pattern correctly shouldn't.

Indeed for perfect symmetry the bus code would free the fwspec after the
corresponding device_del() returns, but there's no harm in that being
factored into iommu_release_device() since the notifier call occurs
sufficiently late in device_del() itself as to make no practical difference.

IIRC there were issues with leaking the dev_iommu :(

AFAICS there was only an issue introduced last year, when some unrelated stuff added an erroneous early return to iommu_release_device() if no group was assigned, and thus subtly broke the existing code (and it did end up getting fixed in a roundabout manner a couple of months later).

I'm working to get things back to that model (wherein the dev_iommu and
fwspec lifecycles become trivial), just with the slight tweak that these
days it's going to make more sense to have the initialisation factored into
device_add() itself (via iommu_probe_device()), rather than beforehand.

I would prefer to simply remove fwspec as I've already shown patches
for. You should give some comment on them.

You mean the 1600 lines of churn which did nothing to address any real problem (but did at least acknowledge so in the cover letter)? I thought I had responded to that, but it must have been one of the many drafts which end up getting deleted out of utter exasperation. Needless to say, the response was a NAK. For the last time, any fwspec lifetime issues are a *symptom* of a well-understood problem which exists, and not a problem in themselves. Yes, due to the evolution of the API there is also now some stuff being carried around in iommu_fwspec that really shouldn't need to be, but once probing is properly fixed it will get stripped back down to the useful shared abstraction of stored firmware data that has always been its true spirit. In the meantime, adding a load more complexity to unabstract it and support 2 or 3 different ways of drivers all individually open-coding storage of the same data is not helpful now, and even less helpful in future.

My main complaint is there is no full vision to remove the 'global
drivers', we will always have some drivers doing FW parsing in probe
and then this different fwspec thing on the side for other drivers.

Honestly I would love to see the DMAR/IVRS parsing decoupled a bit more from the Intel/AMD drivers, not least in the hope that it might allow cleaner separation of the IRQ remapping drivers from the IOMMU API drivers. However I don't have my hopes up since in practice it's probably a non-trivial amount of work with no real functional benefit in the end, and it's certainly not something I'd ever have the time or inclination to attempt myself. The SoC drivers doing their own weird things to parse DT bindings will get cleaned up once arch/arm understands groups, and that *is* all on my to-do list (and as for the arm-smmu legacy binding, if it still gets in the way at all by that point I'll be inclined to call it obsolete and drop support).

Thanks,
Robin.