On Wed, 7 Jun 2023 12:07:52 -0700
Sidhartha Kumar <sidhartha.kumar@xxxxxxxxxx> wrote:
> From vfio_iommu_type1_release() there is a code path:
>
>   vfio_iommu_unmap_unpin_all()
>     vfio_remove_dma()
>       vfio_unmap_unpin()
>         unmap_unpin_slow()
>           vfio_unpin_pages_remote()
>             vfio_find_vpfn()
>
> This path is taken without acquiring the iommu lock so it could lead to
> a race condition in the traversal of the pfn_list rb tree.
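
The lookup at the bottom of that path is a plain rb-tree walk over
dma->pfn_list. For reference, it looks roughly like this (paraphrased
from drivers/vfio/vfio_iommu_type1.c; a sketch rather than an exact
copy, and details may differ between kernel versions):

        /* Sketch: unsynchronized rb-tree lookup; callers are expected
         * to serialize via iommu->lock. */
        static struct vfio_pfn *vfio_find_vpfn(struct vfio_dma *dma,
                                               dma_addr_t iova)
        {
                struct vfio_pfn *vpfn;
                struct rb_node *node = dma->pfn_list.rb_node;

                while (node) {
                        vpfn = rb_entry(node, struct vfio_pfn, node);

                        if (iova < vpfn->iova)
                                node = node->rb_left;
                        else if (iova > vpfn->iova)
                                node = node->rb_right;
                        else
                                return vpfn;
                }
                return NULL;
        }

Nothing in that walk tolerates concurrent insertion or erasure, hence
the reliance on iommu->lock for serialization.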
What's the competing thread for the race? vfio_remove_dma() tests:

        WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list));

The fix is not unreasonable, but is this a theoretical fix upstream
that's tickled by some downstream additions, or are we actually
competing against page pinning by an mdev driver after the container is
released?  Thanks,

Alex
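
To help frame that question: the external pinning side that could, in
principle, be the competing thread is the vfio_pin_pages() path used by
mdev drivers, roughly the chain below (paraphrased from mainline vfio
of this era; treat the exact names as a sketch):

        vfio_pin_pages()                  /* exported API, called by mdev drivers */
          vfio_iommu_type1_pin_pages()    /* takes mutex_lock(&iommu->lock) */
            vfio_pin_page_external()
              vfio_add_to_pfn_list()      /* inserts into dma->pfn_list */

That side does hold iommu->lock, so the open question above is whether
such a caller can still be in flight once the container release has
begun.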
> The lack of the iommu lock in vfio_iommu_type1_release() was confirmed
> by adding a
>
>         WARN_ON(!mutex_is_locked(&iommu->lock))
>
> which triggered and was reported in dmesg. Fix this potential race by
> taking and releasing the iommu lock in vfio_iommu_type1_release().
>
> Suggested-by: Khalid Aziz <khalid.aziz@xxxxxxxxxx>
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@xxxxxxxxxx>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 2 ++
>  1 file changed, 2 insertions(+)
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 306e6f1d1c70e..7d2fea1b483dc 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -2601,7 +2601,9 @@ static void vfio_iommu_type1_release(void *iommu_data)
>                  kfree(group);
>          }
>  
> +        mutex_lock(&iommu->lock);
>          vfio_iommu_unmap_unpin_all(iommu);
> +        mutex_unlock(&iommu->lock);
>  
>          list_for_each_entry_safe(domain, domain_tmp,
>                                   &iommu->domain_list, next) {
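
A related note, not part of the posted patch: with the lock taken in
the release path, the locking rule could be made self-documenting in
the callee. A minimal sketch, assuming lockdep and paraphrasing the
mainline body of vfio_iommu_unmap_unpin_all():

        static void vfio_iommu_unmap_unpin_all(struct vfio_iommu *iommu)
        {
                struct rb_node *node;

                /* Sketch: assert that every caller holds iommu->lock; a
                 * WARN_ON(!mutex_is_locked(&iommu->lock)) as used in the
                 * report above would serve the same purpose. */
                lockdep_assert_held(&iommu->lock);

                while ((node = rb_first(&iommu->dma_list)))
                        vfio_remove_dma(iommu, rb_entry(node,
                                        struct vfio_dma, node));
        }

The other caller, vfio_iommu_type1_detach_group(), already runs this
under iommu->lock (again, going by mainline code of this era), so such
an assertion should hold on both paths once the fix is applied.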