Re: [RESEND PATCH v8 01/11] iommu: Add DMA ownership management interfaces

From: Steven Price
Date: Wed Jun 15 2022 - 05:53:42 EST


On 18/04/2022 01:49, Lu Baolu wrote:
> Multiple devices may be placed in the same IOMMU group because they
> cannot be isolated from each other. These devices must either be
> entirely under kernel control or userspace control, never a mixture.
>
> This adds dma ownership management in iommu core and exposes several
> interfaces for the device drivers and the device userspace assignment
> framework (i.e. VFIO), so that any conflict between user and kernel
> controlled dma could be detected at the beginning.
>
> The device driver oriented interfaces are,
>
> int iommu_device_use_default_domain(struct device *dev);
> void iommu_device_unuse_default_domain(struct device *dev);
>
> By calling iommu_device_use_default_domain(), the device driver tells
> the iommu layer that the device dma is handled through the kernel DMA
> APIs. The iommu layer will manage the IOVA and use the default domain
> for DMA address translation.
>
> The device user-space assignment framework oriented interfaces are,
>
> int iommu_group_claim_dma_owner(struct iommu_group *group,
> void *owner);
> void iommu_group_release_dma_owner(struct iommu_group *group);
> bool iommu_group_dma_owner_claimed(struct iommu_group *group);
>
> The device userspace assignment must be disallowed if the DMA owner
> claiming interface returns failure.
>
> Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Signed-off-by: Kevin Tian <kevin.tian@xxxxxxxxx>
> Signed-off-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
> Reviewed-by: Robin Murphy <robin.murphy@xxxxxxx>

I'm seeing a regression that I've bisected to this commit on a Firefly
RK3288 board. The display driver fails to probe properly because
__iommu_attach_group() returns -EBUSY. This causes long hangs and splats
as the display flips timeout.

The call stack to __iommu_attach_group() is:

__iommu_attach_group from iommu_attach_device+0x64/0xb4
iommu_attach_device from rockchip_drm_dma_attach_device+0x20/0x50
rockchip_drm_dma_attach_device from vop_crtc_atomic_enable+0x10c/0xa64
vop_crtc_atomic_enable from drm_atomic_helper_commit_modeset_enables+0xa8/0x290
drm_atomic_helper_commit_modeset_enables from drm_atomic_helper_commit_tail_rpm+0x44/0x8c
drm_atomic_helper_commit_tail_rpm from commit_tail+0x9c/0x180
commit_tail from drm_atomic_helper_commit+0x164/0x18c
drm_atomic_helper_commit from drm_atomic_commit+0xac/0xe4
drm_atomic_commit from drm_client_modeset_commit_atomic+0x23c/0x284
drm_client_modeset_commit_atomic from drm_client_modeset_commit_locked+0x60/0x1c8
drm_client_modeset_commit_locked from drm_client_modeset_commit+0x24/0x40
drm_client_modeset_commit from drm_fb_helper_set_par+0xb8/0xf8
drm_fb_helper_set_par from drm_fb_helper_hotplug_event.part.0+0xa8/0xc0
drm_fb_helper_hotplug_event.part.0 from output_poll_execute+0xb8/0x224

> @@ -2109,7 +2115,7 @@ static int __iommu_attach_group(struct iommu_domain *domain,
> {
> int ret;
>
> - if (group->default_domain && group->domain != group->default_domain)
> + if (group->domain && group->domain != group->default_domain)
> return -EBUSY;
>
> ret = __iommu_group_for_each_dev(group, domain,

Reverting this 'fixes' the problem for me. The follow up 0286300e6045
("iommu: iommu_group_claim_dma_owner() must always assign a domain")
doesn't help.

Adding some debug printks I can see that domain is a valid pointer, but
both default_domain and blocking_domain are NULL.

I'm using the DTB from the kernel tree (rk3288-firefly.dtb).

Any ideas?

Thanks,

Steve