Re: [PATCH v9 12/14] iommu: Use refcount for fault data access

From: Baolu Lu
Date: Mon Jan 08 2024 - 21:53:10 EST


On 1/6/24 12:09 AM, Jason Gunthorpe wrote:
On Wed, Dec 20, 2023 at 09:23:30AM +0800, Lu Baolu wrote:
The per-device fault data structure stores information about faults
occurring on a device. Its lifetime spans from IOPF enablement to
disablement. Multiple paths, including IOPF reporting, handling, and
responding, may access it concurrently.

Previously, a mutex protected the fault data from use after free. But
this is not performance friendly due to the critical nature of IOPF
handling paths.

Refine this with a refcount-based approach. The fault data pointer is
obtained within an RCU read region with a refcount. The fault data
pointer is returned for usage only when the pointer is valid and a
refcount is successfully obtained. The fault data is freed with
kfree_rcu(), ensuring data is only freed after all RCU critical regions
complete.

An iopf handling work starts once an iopf group is created. The handling
work continues until iommu_page_response() is called to respond to the
iopf and the iopf group is freed. During this time, the device fault
parameter should always be available. Add a pointer to the device fault
parameter in the iopf_group structure and hold the reference until the
iopf_group is freed.

Make iommu_page_response() static as it is only used in io-pgfault.c.

Co-developed-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
Signed-off-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
Tested-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
---
include/linux/iommu.h | 17 +++--
drivers/iommu/io-pgfault.c | 129 +++++++++++++++++++++++--------------
drivers/iommu/iommu-sva.c | 2 +-
3 files changed, 90 insertions(+), 58 deletions(-)

This looks basically Ok

+/* Caller must hold a reference of the fault parameter. */
+static void iopf_put_dev_fault_param(struct iommu_fault_param *fault_param)
+{
+ if (refcount_dec_and_test(&fault_param->users))
+ kfree_rcu(fault_param, rcu);
+}

[..]

@@ -402,10 +429,12 @@ int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev)
INIT_LIST_HEAD(&fault_param->faults);
INIT_LIST_HEAD(&fault_param->partial);
fault_param->dev = dev;
+ refcount_set(&fault_param->users, 1);
+ init_rcu_head(&fault_param->rcu);

No need to do init_rcu_head() when only using it for calling
kfree_rcu()

Removed.


@@ -454,8 +485,10 @@ int iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev)
list_for_each_entry_safe(iopf, next, &fault_param->partial, list)
kfree(iopf);
- param->fault_param = NULL;
- kfree(fault_param);
+ /* dec the ref owned by iopf_queue_add_device() */
+ rcu_assign_pointer(param->fault_param, NULL);
+ if (refcount_dec_and_test(&fault_param->users))
+ kfree_rcu(fault_param, rcu);

Why open code iopf_put_dev_fault_param()? Just call it.

Done.


With those:

Reviewed-by: Jason Gunthorpe <jgg@xxxxxxxxxx>

Jason

Thank you very much!

Best regards,
baolu