Re: [PATCH v5 5/6] iommu: Support mm PASID 1:n with sva domains

From: Nicolin Chen
Date: Fri Oct 06 2023 - 04:08:04 EST


Hi Tina,

On Sun, Sep 24, 2023 at 07:38:12PM -0700, Tina Zhang wrote:

> Each mm bound to devices gets a PASID and corresponding sva domains
> allocated in iommu_sva_bind_device(), which are referenced by iommu_mm
> field of the mm. The PASID is released in __mmdrop(), while a sva domain
> is released when no one is using it (the reference count is decremented
> in iommu_sva_unbind_device()). However, although sva domains and their
> PASID are separate objects such that their own life cycles could be
> handled independently, an enqcmd use case may require releasing the
> PASID in releasing the mm (i.e., once a PASID is allocated for a mm, it
> will be permanently used by the mm and won't be released until the end
> of mm) and only allows to drop the PASID after the sva domains are
> released. To this end, mmgrab() is called in iommu_sva_domain_alloc() to
> increment the mm reference count and mmdrop() is invoked in
> iommu_domain_free() to decrement the mm reference count.
>
> Since the required info of PASID and sva domains is kept in struct
> iommu_mm_data of a mm, use mm->iommu_mm field instead of the old pasid
> field in mm struct. The sva domain list is protected by iommu_sva_lock.
>
> Besides, this patch removes mm_pasid_init(), as with the introduced
> iommu_mm structure, initializing mm pasid in mm_init() is unnecessary.
>
> Reviewed-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
> Reviewed-by: Vasant Hegde <vasant.hegde@xxxxxxx>
> Signed-off-by: Tina Zhang <tina.zhang@xxxxxxxxx>

> @@ -128,8 +142,9 @@ void iommu_sva_unbind_device(struct iommu_sva *handle)
> struct device *dev = handle->dev;
>
> mutex_lock(&iommu_sva_lock);
> + iommu_detach_device_pasid(domain, dev, pasid);
> if (--domain->users == 0) {
> - iommu_detach_device_pasid(domain, dev, pasid);
> + list_del(&domain->next);
> iommu_domain_free(domain);
> }
> mutex_unlock(&iommu_sva_lock);
> @@ -209,4 +224,5 @@ void mm_pasid_drop(struct mm_struct *mm)
> return;
>
> iommu_free_global_pasid(mm_get_pasid(mm));
> + kfree(mm->iommu_mm);

I ran some SVA tests by applying this series on top of my local
SMMUv3 tree, though it is not exactly a vanilla mainline tree.
And I see a WARN_ON introduced by this patch (did git-bisect):

[ 364.237319] ------------[ cut here ]------------
[ 364.237328] ida_free called for id=12 which is not allocated.
[ 364.237346] WARNING: CPU: 2 PID: 11003 at lib/idr.c:525 ida_free+0x10c/0x1d0
....
[ 364.237415] pc : ida_free+0x10c/0x1d0
[ 364.237416] lr : ida_free+0x10c/0x1d0
....
[ 364.237439] Call trace:
[ 364.237440] ida_free+0x10c/0x1d0
[ 364.237442] iommu_free_global_pasid+0x30/0x50
[ 364.237449] mm_pasid_drop+0x44/0x70
[ 364.237452] __mmdrop+0xf4/0x210
[ 364.237457] finish_task_switch.isra.0+0x238/0x2e8
[ 364.237460] schedule_tail+0x1c/0x1b8
[ 364.237462] ret_from_fork+0x4/0x20
[ 364.237466] irq event stamp: 0
[ 364.237467] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[ 364.237470] hardirqs last disabled at (0): [<ffffc0c16022e558>] copy_process+0x770/0x1c78
[ 364.237473] softirqs last enabled at (0): [<ffffc0c16022e558>] copy_process+0x770/0x1c78
[ 364.237475] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 364.237476] ---[ end trace 0000000000000000 ]---

I haven't traced it closely to see what's wrong, due to some other
tasks. Yet, if you have some idea about this or something that you
want me to try, let me know.

Thanks
Nicolin