Re: [PATCH] iommu/vt-d: Check for non-NULL domain on device release

From: Jason Gunthorpe
Date: Tue Jan 16 2024 - 21:20:40 EST


On Wed, Jan 17, 2024 at 01:57:34AM +0000, Robin Murphy wrote:
> > > ---
> > > drivers/iommu/intel/iommu.c | 10 ++++++----
> > > 1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > Unfortunately this issue is likely systemic in all the drivers.
>
> All two of the drivers which support deferred attach, that is.

We could call release_device on an unlikely error unwind path with a
NULL domain for all drivers.

> > Something I've been thinking about for a while now is to have the
> > option for the core code release to set the driver to a specific
> > releasing domain (probably a blocking or identity global static)
> >
> > A lot of drivers are open coding this internally in their release
> > functions, like Intel. But it really doesn't deserve to be special.
>
> Either way it shouldn't apply in this case, though. Crash kernels *are*
> special. The whole point of deferred attach is that we don't disturb
> anything unless we've got as far as definitely using a given default domain
> (from which we can infer the relevant client device should have been reset
> and quiesced).

If only it were so simple.. It assumes drivers are structured to reset
their devices using only MMIO before doing anything to setup DMA and
trigger an iommu attach. A lot of drivers aren't and this whole scheme
turns a bit sketchy.

Some devices can't even do it at all. eg mlx5 has the crashing kernel
quiet the device before passing over to the crash kernel because there
is no way beyond FLR to regain control of a mlx5 device.

> Thus regardless of why release might get called, if a
> deferred attach never happened then the release should still avoid touching
> any hardware configuration either.

Interesting point.. Makes sense to me

> I'd suggest using dev->iommu->attach_deferred as the condition rather than a
> NULL domain, to be clear that it's the *only* valid reason to not have a
> domain at this point.

So if we take this release_domain approach and we have the core code
not call it for the defered attach case then the driver's
release_device should just free memory allocated by probe and not
touch the data structures?

Then the idea is that drivers would only manipulate their STE/DTE/etc
under attach and never in probe/release.

Jason