Race between of_iommu_configure() and iommu_probe_device()

From: Hector Martin
Date: Fri Nov 03 2023 - 07:56:10 EST


I just hit a crash in of_iommu_xlate() -> apple_dart_of_xlate() because
dev->iommu was NULL. of_iommu_xlate() first calls iommu_fwspec_init(),
which calls dev_iommu_get(), which allocates that member if it is NULL.
That means it got freed in between, but the only thing that can do that
is dev_iommu_free(), which is called from __iommu_probe_device() in the
error path. That path is serialized via a static lock, but not against
the xlate path.

I think the specific sequence of events was as follows:

- IOMMU driver has not probed yet
- Device driver tries to probe, and gets deferred via of_iommu_xlate()
  -> driver_deferred_probe_check_state() because there are no IOMMU ops
  yet
- IOMMU driver probes
- IOMMU driver registration triggers device probes
- IOMMU device probe fails, because there is no fwnode/OF data yet (e.g.
  apple_dart_probe_device() returns ENODEV if dev_iommu_priv_get()
  returns NULL, and that is set in apple_dart_of_xlate())
- __iommu_probe_device() is in the error exit path, and at this exact
  point a parallel device probe is running of_iommu_xlate()
- of_iommu_xlate() calls iommu_fwspec_init(), which ensures dev->iommu
  is non-NULL, which at this point it is
- immediately after that, __iommu_probe_device() calls dev_iommu_free()
  since it is in the process of erroring out. This frees dev->iommu and
  sets it to NULL.
- of_iommu_xlate() calls ops->of_xlate()
- apple_dart_of_xlate() calls dev_iommu_priv_set(), which crashes
  because dev->iommu is now NULL.
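
To make the suspected interleaving concrete, here is a minimal user-space
model that replays the steps above in a fixed order on one thread. It is
only a sketch: every model_* name and replay_race() are hypothetical
stand-ins, not kernel code, and model_priv_set() has a NULL check added so
that the model reports the problem instead of crashing.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct dev_iommu and struct device. */
struct model_iommu { void *fwspec; void *priv; };
struct model_dev { struct model_iommu *iommu; };

/* Models dev_iommu_get(): allocate dev->iommu only if it is NULL. */
static struct model_iommu *model_iommu_get(struct model_dev *dev)
{
	if (!dev->iommu)
		dev->iommu = calloc(1, sizeof(*dev->iommu));
	return dev->iommu;
}

/* Models dev_iommu_free(): free the member and reset it to NULL. */
static void model_iommu_free(struct model_dev *dev)
{
	free(dev->iommu);
	dev->iommu = NULL;
}

/*
 * Models dev_iommu_priv_set(). The NULL check is an addition for the
 * model only; it returns false where the real code would oops.
 */
static bool model_priv_set(struct model_dev *dev, void *priv)
{
	if (!dev->iommu)
		return false;
	dev->iommu->priv = priv;
	return true;
}

/* Returns true if the xlate step would have dereferenced NULL. */
bool replay_race(void)
{
	struct model_dev dev = { .iommu = NULL };
	int dart_cfg = 0;	/* placeholder for the driver's priv data */
	bool ok;

	/* Thread A, probe path: allocates dev->iommu, then probe fails. */
	model_iommu_get(&dev);

	/* Thread B, xlate path: fwspec init sees a non-NULL dev->iommu. */
	if (!model_iommu_get(&dev))
		return false;	/* allocation failure, not the race */

	/* Thread A, error exit: frees dev->iommu and sets it to NULL. */
	model_iommu_free(&dev);

	/* Thread B, ops->of_xlate(): tries to set priv through NULL. */
	ok = model_priv_set(&dev, &dart_cfg);

	model_iommu_free(&dev);
	return !ok;
}
```

Calling replay_race() returns true, i.e. with this ordering the priv store
lands on a NULL dev->iommu, which matches the crash.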

As far as I can tell it's not just the specific driver xlate call
setting priv that's the problem here, but there is one big race between
the entire fwspec codepath (accessing dev->iommu->fwspec) and
__iommu_probe_device() (allocating and freeing dev->iommu).
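
As a strawman only, one conceivable direction is to put the whole fwspec
path under the same lock as the probe path. The user-space sketch below
models that with a plain flag standing in for a mutex; all names are
hypothetical, the "lock" only reports contention instead of blocking, and
this says nothing about whether such locking is actually viable in the
kernel (lock ordering, other fwspec users, and so on).

```c
#include <stdbool.h>
#include <stdlib.h>

struct model_iommu { void *priv; };
struct model_dev {
	struct model_iommu *iommu;
	bool lock_held;		/* hypothetical stand-in for a shared mutex */
};

enum probe_result { PROBE_OK, PROBE_NODEV, PROBE_BLOCKED };

static bool model_trylock(struct model_dev *dev)
{
	if (dev->lock_held)
		return false;
	dev->lock_held = true;
	return true;
}

static void model_unlock(struct model_dev *dev)
{
	dev->lock_held = false;
}

/* Probe path: fails and frees dev->iommu if xlate has not set priv. */
static enum probe_result model_probe(struct model_dev *dev)
{
	enum probe_result ret = PROBE_OK;

	if (!model_trylock(dev))
		return PROBE_BLOCKED;	/* a real mutex would sleep here */
	if (!dev->iommu)
		dev->iommu = calloc(1, sizeof(*dev->iommu));
	if (!dev->iommu || !dev->iommu->priv) {
		free(dev->iommu);
		dev->iommu = NULL;
		ret = PROBE_NODEV;
	}
	model_unlock(dev);
	return ret;
}

/* Xlate path: allocation and priv store happen under the same lock. */
static bool model_xlate_locked(struct model_dev *dev, void *priv)
{
	enum probe_result midway;

	if (!model_trylock(dev))
		return false;
	if (!dev->iommu)
		dev->iommu = calloc(1, sizeof(*dev->iommu));
	if (!dev->iommu) {
		model_unlock(dev);
		return false;
	}
	/* The probe path tries to run in the window that crashed. */
	midway = model_probe(dev);
	dev->iommu->priv = priv;
	model_unlock(dev);
	return midway == PROBE_BLOCKED;
}

/* Returns true if the locked variant survives the same interleaving. */
bool replay_with_lock(void)
{
	struct model_dev dev = { .iommu = NULL, .lock_held = false };
	int dart_cfg = 0;	/* placeholder for the driver's priv data */
	bool first_fails, xlate_ok, second_ok, result;

	first_fails = model_probe(&dev) == PROBE_NODEV;
	xlate_ok = model_xlate_locked(&dev, &dart_cfg);
	second_ok = model_probe(&dev) == PROBE_OK;

	result = first_fails && xlate_ok && second_ok &&
		 dev.iommu && dev.iommu->priv == &dart_cfg;
	free(dev.iommu);
	return result;
}
```

In this model the early probe still fails with no priv set, the probe
attempt in the middle of xlate is held off, and a later probe succeeds
because priv is in place. It only closes the NULL window; it does not
answer whether the probe should be running before xlate at all.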

Thinking about this whole thing is making my brain hurt. Thoughts? How
do we fix this?

Splat:
apple-dart 228304000.iommu: DART [pagesize 4000, 16 streams, bypass
support: 0, bypass forced: 0, locked: 0, AS 32 -> 36] initialized
apple-dart 231304000.iommu: DART [pagesize 4000, 16 streams, bypass
support: 1, bypass forced: 0, locked: 1, AS 32 -> 36] initialized
apple-dart 23130c000.iommu: DART [pagesize 4000, 16 streams, bypass
support: 1, bypass forced: 0, locked: 1, AS 32 -> 36] initialized
Unable to handle kernel NULL pointer dereference at virtual address
0000000000000040
fbcon: Taking over console
apple-dart 22c0e8000.iommu: DART [pagesize 4000, 16 streams, bypass
support: 1, bypass forced: 0, locked: 0, AS 32 -> 36] initialized
Mem abort info:
ESR = 0x0000000096000044
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
Data abort info:
ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
CM = 0, WnR = 1, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 16k pages, 48-bit VAs, pgdp=000000081d900d50
[0000000000000040] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 0000000096000044 [#1] SMP
Modules linked in: i2c_apple apple_dart(+) adpdrm drm_dma_helper sunrpc
vfat fat nvme_apple apple_sart nvme_core nvme_common scsi_dh_rdac
scsi_dh_emc scsi_dh_alua fuse dm_multipath
CPU: 2 PID: 12 Comm: kworker/u16:1 Tainted: G S
6.5.6-403.asahi.fc39.aarch64+16k #1
Hardware name: Apple MacBook Pro (13-inch, M1, 2020) (DT)
Workqueue: events_unbound deferred_probe_work_func
pstate: 61400009 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
apple-dart 22c0f4000.iommu: DART [pagesize 4000, 16 streams, bypass
support: 0, bypass forced: 0, locked: 0, AS 32 -> 36] initialized
pc : apple_dart_of_xlate+0x54/0x1f8 [apple_dart]
lr : apple_dart_of_xlate+0x154/0x1f8 [apple_dart]
sp : ffff8000800e3a10
x29: ffff8000800e3a10 x28: 0000000000000000 x27: 0000000000000000
x26: ffffcfcf9c75d9c0 x25: 0000000000000000 x24: ffffcfcf9b54c798
x23: ffffcfcf9b54d568 x22: ffff1aea0a88dc10 x21: ffff1aea22738080
x20: 0000000000000000 x19: ffff1aea0af34000 x18: ffffffffffffffff
x17: ffff4b1e41cac000 x16: ffffcfcf99eabeb0 x15: ffff8000800e3810
x14: ffffffffffffffff x13: 0000000000000000 x12: 0000000000000003
x11: 0101010101010101 x10: 000000000011cdb8 x9 : 0000000000000000
x8 : ffff1aea0af34080 x7 : 0000000000000000 x6 : 000000000000003f
x5 : 0000000000000040 x4 : ffff8000800e3960 x3 : 0000000000000000
x2 : 0000000000000000 x1 : ffff1aea0a30ee00 x0 : 0000000000000000
Call trace:
apple_dart_of_xlate+0x54/0x1f8 [apple_dart]
of_iommu_xlate+0xa4/0xe8
of_iommu_configure+0x190/0x1f8
of_dma_configure_id+0x13c/0x348
platform_dma_configure+0x38/0xd0
really_probe+0x7c/0x3d8
__driver_probe_device+0x84/0x180
driver_probe_device+0x44/0x120
__device_attach_driver+0xc4/0x168
bus_for_each_drv+0x90/0xf8
__device_attach+0xa8/0x1c8
device_initial_probe+0x1c/0x30
bus_probe_device+0xb4/0xc0
deferred_probe_work_func+0xbc/0x118
process_one_work+0x1f4/0x4a0
worker_thread+0x74/0x418
kthread+0xf4/0x108
ret_from_fork+0x10/0x20
Code: 540003a1 b9400e94 b40007b3 f94172c0 (f9002013)
---[ end trace 0000000000000000 ]---
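
For anyone cross-checking the splat, the small sketch below decodes the
ESR value and the faulting instruction word from the Code: line. The
helper and struct names are made up for illustration; the bit positions
are the architectural ESR_ELx layout (EC in bits 31:26, IL in bit 25, WnR
in bit 6, fault status code in bits 5:0) and the A64 64-bit STR
(immediate, unsigned offset) encoding.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the ESR fields of interest. */
struct esr_fields {
	unsigned int ec;	/* exception class, bits 31:26 */
	unsigned int il;	/* instruction length bit, bit 25 */
	unsigned int wnr;	/* write-not-read, bit 6 */
	unsigned int dfsc;	/* data fault status code, bits 5:0 */
};

struct esr_fields decode_esr(uint64_t esr)
{
	struct esr_fields f;

	f.ec = (esr >> 26) & 0x3f;
	f.il = (esr >> 25) & 0x1;
	f.wnr = (esr >> 6) & 0x1;
	f.dfsc = esr & 0x3f;
	return f;
}

/* Hypothetical container for a decoded STR Xt, [Xn, #offset]. */
struct str_imm {
	unsigned int rt;	/* register being stored */
	unsigned int rn;	/* base register */
	unsigned int offset;	/* byte offset, imm12 scaled by 8 */
};

/* Returns false if insn is not a 64-bit STR with unsigned offset. */
bool decode_str64_uoff(uint32_t insn, struct str_imm *out)
{
	if ((insn & 0xffc00000u) != 0xf9000000u)
		return false;
	out->rt = insn & 0x1f;
	out->rn = (insn >> 5) & 0x1f;
	out->offset = ((insn >> 10) & 0xfff) * 8;
	return true;
}
```

decode_esr(0x96000044) gives EC 0x25, WnR 1 and fault status 0x04, which
agrees with the "Mem abort info" lines. decode_str64_uoff(0xf9002013, ...)
gives a store of x19 to [x0, #0x40]; with x0 = 0 in the register dump that
is a write to address 0x40 through a NULL base pointer, which is
consistent with a store into dev->iommu while dev->iommu is NULL.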

- Hector