Re: [PATCH v8 12/12] iommu: Use refcount for fault data access

From: Baolu Lu
Date: Tue Dec 12 2023 - 00:22:36 EST


On 12/11/23 11:24 PM, Jason Gunthorpe wrote:
> Also iopf_queue_remove_device() is messed up - it returns an error
> code but nothing ever does anything with it 🙁 Remove functions like
> this should never fail.

Yes, agreed.


> Removal should be like I explained earlier:
> - Disable new PRI reception

This could be done by

rcu_assign_pointer(param->fault_param, NULL);

?

> - Ack all outstanding PRQ to the device

All outstanding page requests are responded to with
IOMMU_PAGE_RESP_INVALID, indicating that the device should not attempt
any retry.

> - Disable PRI on the device
> - Tear down the iopf infrastructure

> So under this model if the iopf_queue_remove_device() has been called
> it should be sort of a 'disassociate' action where fault_param is
> still floating out there but iommu_page_response() does nothing.

Yes. All pending requests have been auto-responded.
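
To make sure we mean the same thing by 'disassociate', here is a rough
user-space model of the removal sequence discussed above. It is only a
sketch, not kernel code: every model_* name, struct and enum is made up
for illustration, RESP_INVALID merely stands in for
IOMMU_PAGE_RESP_INVALID, and RCU and locking are left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_PENDING 8

/* Hypothetical response codes standing in for IOMMU_PAGE_RESP_*. */
enum resp_code { RESP_NONE = 0, RESP_SUCCESS, RESP_INVALID };

/* Hypothetical stand-in for the per-device fault data. */
struct fault_param_model {
	enum resp_code pending[MAX_PENDING]; /* RESP_NONE: still outstanding */
	int nr_pending;
};

struct dev_model {
	struct fault_param_model *fault_param; /* NULL: no new faults accepted */
	int pri_enabled;
};

/* Queue a new page request; refused once the device is disassociated. */
static int model_report_fault(struct dev_model *dev)
{
	struct fault_param_model *fp = dev->fault_param;

	if (!fp)
		return -ENODEV;
	if (fp->nr_pending == MAX_PENDING)
		return -ENOSPC;
	fp->pending[fp->nr_pending++] = RESP_NONE;
	return 0;
}

/* Respond to one request; does nothing if it was already responded to. */
static void model_page_response(struct fault_param_model *fp, int idx,
				enum resp_code code)
{
	if (idx < 0 || idx >= fp->nr_pending)
		return;
	if (fp->pending[idx] != RESP_NONE)
		return;
	fp->pending[idx] = code;
}

/* Removal has no failure path, so it returns void. */
static void model_remove_device(struct dev_model *dev)
{
	struct fault_param_model *fp = dev->fault_param;
	int i;

	if (!fp)
		return;

	dev->fault_param = NULL;		/* 1. disable new PRI reception */
	for (i = 0; i < fp->nr_pending; i++)	/* 2. ack all outstanding PRQ */
		if (fp->pending[i] == RESP_NONE)
			fp->pending[i] = RESP_INVALID;
	dev->pri_enabled = 0;			/* 3. disable PRI on the device */
	/* 4. fp itself is torn down by whoever drops the last reference */
}
```

In this model a late model_page_response() after removal finds the
request already answered and leaves it alone, which is the "does
nothing" behaviour described above.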

> IOW pass the refcount from the iommu_report_device_fault() down into
> the fault handler, into the work and then into iommu_page_response()
> which will ultimately put it back.

Yes.
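
As I understand it, the reference handoff would look roughly like the
user-space sketch below. This is an illustration under assumptions, not
the patch itself: it uses C11 atomics in place of the kernel's
refcount_t, and all model_* names, the "freed" test hook and the work
struct are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the refcounted fault data. */
struct fault_param_model {
	atomic_int users;	/* one count per holder */
	int *freed;		/* test hook: set to 1 on release */
};

/* Hypothetical deferred work item that carries a reference. */
struct fault_work_model {
	struct fault_param_model *fp;
};

static struct fault_param_model *model_alloc(int *freed)
{
	struct fault_param_model *fp = malloc(sizeof(*fp));

	if (!fp)
		return NULL;
	atomic_init(&fp->users, 1);	/* reference held by the device */
	fp->freed = freed;
	return fp;
}

static void model_get(struct fault_param_model *fp)
{
	atomic_fetch_add(&fp->users, 1);
}

static void model_put(struct fault_param_model *fp)
{
	/* atomic_fetch_sub() returns the old value: 1 means last holder. */
	if (atomic_fetch_sub(&fp->users, 1) == 1) {
		*fp->freed = 1;
		free(fp);
	}
}

/* Report path: take a reference and hand it to the work item. */
static void model_report_fault(struct fault_param_model *fp,
			       struct fault_work_model *work)
{
	model_get(fp);
	work->fp = fp;
}

/* Response path: the final consumer puts the reference back. */
static void model_page_response(struct fault_param_model *fp)
{
	model_put(fp);
}

/* Deferred work: forwards its reference into the response path. */
static void model_run_work(struct fault_work_model *work)
{
	model_page_response(work->fp);
	work->fp = NULL;
}
```

With this shape the device can drop its own reference during removal
while a fault is still in flight, and the data is only released once
the response path has put back the reference taken at report time.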

Best regards,
baolu