Re: arm64 iommu groups issue

From: John Garry
Date: Mon Jun 15 2020 - 03:37:12 EST


On 12/06/2020 15:30, Lorenzo Pieralisi wrote:
On Mon, Feb 17, 2020 at 12:08:48PM +0000, John Garry wrote:

Right, and even worse is that it relies on the port driver even
existing at all.

All this iommu group assignment should be taken outside device
driver probe paths.

However we could still consider device links for sync'ing the SMMU
and each device probing.

Yes, we should get that for DT now thanks to the of_devlink stuff, but
cooking up some equivalent for IORT might be worthwhile.

It doesn't solve this problem, but at least we could remove the iommu_ops
check in iort_iommu_xlate().

We would need to carve out a path from pci_device_add() or even device_add()
to solve all cases.


Another thought that crosses my mind is that when pci_device_group()
walks up to the point of ACS isolation and doesn't find an existing
group, it can still infer that everything it walked past *should* be put
in the same group it's then eventually going to return. Unfortunately I
can't see an obvious way for it to act on that knowledge, though, since
recursive iommu_probe_device() is unlikely to end well.


[...]

And this looks to be the reason for which current
iommu_bus_init()->bus_for_each_device(..., add_iommu_group) fails
also.

Of course, just adding a 'correct' add_device replay without the
of_xlate process doesn't help at all. No wonder this looked suspiciously
simpler than where the first idea left off...

(on reflection, the core of this idea seems to be recycling the existing
iommu_bus_init walk rather than building up a separate "waiting list",
while forgetting that that wasn't the difficult part of the original
idea anyway)

We could still use a bus walk to add the group per iommu, but we would need
an additional check to ensure the device is associated with the IOMMU.


On this current code mentioned, the principle of this seems wrong to
me - we call bus_for_each_device(..., add_iommu_group) for the first
SMMU in the system which probes, but we attempt to add_iommu_group()
for all devices on the bus, even though the SMMU for that device may
yet to have probed.

Yes, iommu_bus_init() is one of the places still holding a
deeply-ingrained assumption that the ops go live for all IOMMU instances
at once, which is what warranted the further replay in
of_iommu_configure() originally. Moving that out of
of_platform_device_create() to support probe deferral is where the
trouble really started.

I'm not too familiar with the history here, but could this be reverted now
with the introduction of of_devlink stuff?

Hi John,

Hi Lorenzo,


have we managed to reach a consensus on this thread on how to solve
the issue ?

No, not really. So Robin and I tried a couple of quick things previously, but they came did not come to much, as above.

Asking because this thread seems stalled - I am keen on
getting it fixed.

I haven't spent more time on this. But from what I was hearing last time, this issue was ticketed internally for arm, so I was waiting for that to be picked up to re-engage.

Thanks,
John