Re: [PATCH 1/1] iommu: Avoid races around default domain allocations

From: Nikhil V
Date: Mon Jan 29 2024 - 03:00:01 EST




On 1/18/2024 3:41 PM, Nikhil V wrote:
From: Charan Teja Kalla <quic_charante@xxxxxxxxxxx>

This fix applies to the 6.1 kernel. In the latest kernels, this race
is fixed by the patch series [1] and [2]. This fix can be taken as an
alternative to backporting those series, which seem too intrusive to
be picked for stable branches.
[1] https://lore.kernel.org/all/0-v8-81230027b2fa+9d-iommu_all_defdom_jgg@xxxxxxxxxx/
[2] https://lore.kernel.org/all/0-v5-1b99ae392328+44574-iommu_err_unwind_jgg@xxxxxxxxxx/

A race condition is observed when arm_smmu_device_probe() and
modprobe of client devices happen in parallel. This results
in the allocation of a new default domain for the iommu group
even though one was previously allocated and its IOVA domain
(iovad) was initialized. For this newly allocated default domain,
however, the iovad is not initialized. As a result, devices
requesting DMA allocations use the uninitialized iovad, which
causes a NULL pointer dereference.

Flow:
- During arm_smmu_device_probe(), bus_iommu_probe() is called
as part of iommu_device_register(). This leads to the device probe
via __iommu_probe_device().

- When the modprobe of the client device happens in parallel, it
sets up the DMA configuration for the device using of_dma_configure_id(),
which in turn calls iommu_probe_device(). The default domain is then
allocated and attached using iommu_alloc_default_domain() and
__iommu_attach_device() respectively. This path then initializes a
mapping domain (IOVA domain) and rcaches for the device via
arch_setup_dma_ops()->iommu_setup_dma_ops().

- Next, the bus_iommu_probe() path tries again to allocate
a default domain via probe_alloc_default_domain(). This results in
allocating a new default domain (along with its IOVA domain) via
__iommu_domain_alloc(). However, this newly allocated IOVA domain
is not initialized.

- When the same client device then tries DMA allocations via
iommu_dma_alloc(), it ends up accessing the rcaches of the newly
allocated IOVA domain, which are not initialized. This results
in a NULL pointer dereference.
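
The interleaving above can be replayed in a small userspace model. This is
only a sketch: every model_* name is hypothetical and stands in for the
kernel structures and functions named in the flow, the two paths are run
one after the other instead of concurrently, and no locking is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct model_domain {
	bool iovad_initialized;	/* set only by the DMA-ops setup path */
};

struct model_group {
	struct model_domain *default_domain;
};

/* Models the unpatched allocation: it always installs a fresh domain. */
static int model_alloc_default_domain_unpatched(struct model_group *group)
{
	struct model_domain *dom = calloc(1, sizeof(*dom));

	if (!dom)
		return -1;
	/* Any previously installed domain is silently replaced here. */
	group->default_domain = dom;
	return 0;
}

/* Models iommu_setup_dma_ops(): initializes the current domain's iovad. */
static void model_setup_dma_ops(struct model_group *group)
{
	assert(group->default_domain != NULL);
	group->default_domain->iovad_initialized = true;
}

/* A DMA allocation is only safe if the current domain's iovad is initialized. */
static bool model_dma_alloc_is_safe(const struct model_group *group)
{
	return group->default_domain &&
	       group->default_domain->iovad_initialized;
}

/* Replays the flow; returns whether the final DMA allocation would be safe. */
static bool model_replay_race(void)
{
	struct model_group group = { .default_domain = NULL };
	struct model_domain *first;
	bool safe;

	/* Client modprobe path: allocate, then initialize the iovad. */
	if (model_alloc_default_domain_unpatched(&group))
		return false;
	first = group.default_domain;
	model_setup_dma_ops(&group);

	/* bus_iommu_probe() path: allocates a second, uninitialized domain. */
	if (model_alloc_default_domain_unpatched(&group)) {
		free(first);
		return false;
	}

	/* The client now sees the second domain, whose iovad was never set up. */
	safe = model_dma_alloc_is_safe(&group);

	free(group.default_domain);
	free(first);
	return safe;
}
```

In this model, model_replay_race() reports the final allocation as unsafe
because the group ends up pointing at the second domain.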

Fix this issue by adding a check in iommu_group_alloc_default_domain()
to see if the iommu_group already has a default domain allocated
and initialized.
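
The effect of the check can be sketched in the same kind of userspace
model. The sketch_* names are hypothetical, the flag assignment merely
stands in for iommu_setup_dma_ops(), and the callers are assumed to be
serialized; this illustrates the idea and is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct sketch_domain {
	bool iovad_initialized;
};

struct sketch_group {
	struct sketch_domain *default_domain;
};

/* Models the allocation path with the early return added. */
static int sketch_alloc_default_domain(struct sketch_group *group)
{
	struct sketch_domain *dom;

	/* Mirrors the added check: keep an already allocated default domain. */
	if (group->default_domain)
		return 0;

	dom = calloc(1, sizeof(*dom));
	if (!dom)
		return -1;
	group->default_domain = dom;
	return 0;
}

/* Replays both callers; returns true if the first domain survives intact. */
static bool sketch_replay_with_guard(void)
{
	struct sketch_group group = { .default_domain = NULL };
	struct sketch_domain *first;
	bool same_domain, still_initialized;
	int rc;

	/* First caller allocates the domain and its iovad gets set up. */
	if (sketch_alloc_default_domain(&group))
		return false;
	first = group.default_domain;
	first->iovad_initialized = true;	/* stands in for iommu_setup_dma_ops() */

	/* Second caller hits the guard and must not replace the domain. */
	rc = sketch_alloc_default_domain(&group);
	assert(rc == 0);

	same_domain = (group.default_domain == first);
	still_initialized = group.default_domain->iovad_initialized;

	free(first);
	return same_domain && still_initialized;
}
```

With the guard in place, the second call returns without allocating, so
the group keeps the domain whose iovad was already initialized.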

Signed-off-by: Charan Teja Kalla <quic_charante@xxxxxxxxxxx>
Co-developed-by: Nikhil V <quic_nprakash@xxxxxxxxxxx>
Signed-off-by: Nikhil V <quic_nprakash@xxxxxxxxxxx>
---
drivers/iommu/iommu.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8b3897239477..99f8cd5af497 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1594,6 +1594,9 @@ static int iommu_group_alloc_default_domain(struct bus_type *bus,
 {
 	struct iommu_domain *dom;
 
+	if (group->default_domain)
+		return 0;
+
 	dom = __iommu_domain_alloc(bus, type);
 	if (!dom && type != IOMMU_DOMAIN_DMA) {
 		dom = __iommu_domain_alloc(bus, IOMMU_DOMAIN_DMA);

Hi,

Gentle ping for your feedback. This fix is helping us downstream; without it we see a number of kernel crashes.

Thanks
Nikhil V