Re: [PATCH v12 13/16] iommu: Improve iopf_queue_remove_device()
From: Vasant Hegde
Date: Thu Feb 08 2024 - 00:07:14 EST
On 2/8/2024 7:02 AM, Baolu Lu wrote:
> On 2024/2/8 1:59, Vasant Hegde wrote:
>> Hi Baolu,
>>
>> On 2/7/2024 5:59 PM, Baolu Lu wrote:
>>> On 2024/2/7 10:50, Tian, Kevin wrote:
>>>>> From: Lu Baolu<baolu.lu@xxxxxxxxxxxxxxx>
>>>>> Sent: Wednesday, February 7, 2024 9:33 AM
>>>>>
>>>>> Convert iopf_queue_remove_device() to return void instead of an error code,
>>>>> as the return value is never used. This removal helper is designed to be
>>>>> never-failed, so there's no need for error handling.
>>>>>
>>>>> Ack all outstanding page requests from the device with the response code of
>>>>> IOMMU_PAGE_RESP_INVALID, indicating device should not attempt any retry.
>>>>>
>>>>> Add comments to this helper explaining the steps involved in removing a
>>>>> device from the iopf queue and disabling its PRI. The individual drivers
>>>>> are expected to be adjusted accordingly. Here we just define the expected
>>>>> behaviors of the individual iommu driver from the core's perspective.
>>>>>
>>>>> Suggested-by: Jason Gunthorpe<jgg@xxxxxxxxxx>
>>>>> Signed-off-by: Lu Baolu<baolu.lu@xxxxxxxxxxxxxxx>
>>>>> Reviewed-by: Jason Gunthorpe<jgg@xxxxxxxxxx>
>>>>> Tested-by: Yan Zhao<yan.y.zhao@xxxxxxxxx>
>>>> Reviewed-by: Kevin Tian<kevin.tian@xxxxxxxxx>, with one nit:
>>>>
>>>>> + * Removing a device from an iopf_queue. It's recommended to follow these
>>>>> + * steps when removing a device:
>>>>> *
>>>>> - * Return: 0 on success and <0 on error.
>>>>> + * - Disable new PRI reception: Turn off PRI generation in the IOMMU hardware
>>>>> + * and flush any hardware page request queues. This should be done before
>>>>> + * calling into this helper.
>>>>> + * - Acknowledge all outstanding PRQs to the device: Respond to all outstanding
>>>>> + * page requests with IOMMU_PAGE_RESP_INVALID, indicating the device should
>>>>> + * not retry. This helper function handles this.
>>>> this implies calling iopf_queue_remove_device() here.
>>>>
>>>>> + * - Disable PRI on the device: After calling this helper, the caller could
>>>>> + * then disable PRI on the device.
>>>>> + * - Call iopf_queue_remove_device(): Calling iopf_queue_remove_device()
>>>>> + * essentially disassociates the device. The fault_param might still exist,
>>>>> + * but iommu_page_response() will do nothing. The device fault parameter
>>>>> + * reference count has been properly passed from iommu_report_device_fault()
>>>>> + * to the fault handling work, and will eventually be released after
>>>>> + * iommu_page_response().
>>>>> */
>>>> but here it suggests calling iopf_queue_remove_device() again. If the comment
>>>> is just about to detail the behavior with that invocation shouldn't it be merged
>>>> with the previous one instead of pretending to be the final step for driver
>>>> to call?
>>>
>>> Above just explains the behavior of calling iopf_queue_remove_device().
>>
>> Can you please leave a line -OR- move this to previous para? Otherwise we will
>> get confused.
>
> Sure. I will make it look like below.
>
> /**
> * iopf_queue_remove_device - Remove producer from fault queue
> * @queue: IOPF queue
> * @dev: device to remove
> *
> * Removing a device from an iopf_queue. It's recommended to follow these
> * steps when removing a device:
> *
> * - Disable new PRI reception: Turn off PRI generation in the IOMMU hardware
> * and flush any hardware page request queues. This should be done before
> * calling into this helper.
> * - Acknowledge all outstanding PRQs to the device: Respond to all outstanding
> * page requests with IOMMU_PAGE_RESP_INVALID, indicating the device should
> * not retry. This helper function handles this.
> * - Disable PRI on the device: After calling this helper, the caller could
> * then disable PRI on the device.
> *
> * Calling iopf_queue_remove_device() essentially disassociates the device.
> * The fault_param might still exist, but iommu_page_response() will do
> * nothing. The device fault parameter reference count has been properly
> * passed from iommu_report_device_fault() to the fault handling work, and
> * will eventually be released after iommu_page_response().
> */
Looks good. Thank you.
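
For reference, the ordering described in the comment above could be sketched
roughly as below. This is only a userspace toy model, not kernel code: every
mock_* name, the struct fields, and the response enum are hypothetical
stand-ins made up for illustration. Only the ordering (stop PRI in the IOMMU,
let the helper answer outstanding requests with an "invalid" response, then
disable PRI on the device) is taken from the comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical response codes; only the "invalid" idea mirrors the thread. */
enum mock_resp { MOCK_RESP_NONE = 0, MOCK_RESP_SUCCESS, MOCK_RESP_INVALID };

#define MOCK_MAX_PRQ 8

/* Toy model of a device; none of these fields exist in the kernel. */
struct mock_dev {
	bool iommu_pri_enabled;	/* IOMMU side still accepts page requests */
	bool dev_pri_enabled;	/* PRI still enabled on the device itself */
	bool in_queue;		/* device is associated with the iopf queue */
	size_t nr_pending;	/* outstanding page requests */
	enum mock_resp resp[MOCK_MAX_PRQ];	/* response sent per request */
};

/* Step 1 (driver, before the helper): stop new page requests in the IOMMU. */
static void mock_iommu_disable_pri(struct mock_dev *d)
{
	d->iommu_pri_enabled = false;
}

/*
 * Step 2: stand-in for the removal helper. It returns void, answers every
 * outstanding request with "invalid" so the device does not retry, and then
 * disassociates the device from the queue.
 */
static void mock_queue_remove_device(struct mock_dev *d)
{
	size_t i;

	/* The comment expects step 1 to have been done already. */
	assert(!d->iommu_pri_enabled);

	for (i = 0; i < d->nr_pending && i < MOCK_MAX_PRQ; i++)
		d->resp[i] = MOCK_RESP_INVALID;
	d->nr_pending = 0;
	d->in_queue = false;
}

/* Step 3 (driver, after the helper): disable PRI on the device. */
static void mock_dev_disable_pri(struct mock_dev *d)
{
	d->dev_pri_enabled = false;
}

/* The three steps in the recommended order. */
void mock_teardown(struct mock_dev *d)
{
	mock_iommu_disable_pri(d);
	mock_queue_remove_device(d);
	mock_dev_disable_pri(d);
}
```

A caller in this model would fill in a struct mock_dev with a few pending
requests and call mock_teardown(); afterwards all pending entries carry the
"invalid" response and the device is no longer in the queue.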
-Vasant