RE: [PATCH v8 3/5] iommufd: Add IOMMU_GET_HW_INFO

From: Tian, Kevin
Date: Thu Aug 17 2023 - 03:32:39 EST


> From: Liu, Yi L <yi.l.liu@xxxxxxxxx>
> Sent: Wednesday, August 16, 2023 8:14 PM
>
> Under nested IOMMU translation, userspace owns the stage-1 translation
> table (e.g. the stage-1 page table of Intel VT-d or the context table of
> ARM SMMUv3, and etc.). Stage-1 translation tables are vendor specific, and
> need to be compatible with the underlying IOMMU hardware. Hence, userspace
> should know the IOMMU hardware capability before creating and configuring
> the stage-1 translation table to kernel.
>
> This adds IOMMU_GET_HW_INFO ioctl to query the IOMMU hardware information
> (a.k.a capability) for a given device. The returned data is vendor
> specific, userspace needs to decode it with the structure by the output
> @out_data_type field.

"The format of the returned data is vendor specific and must be decoded
according to the @out_data_type field".

> +
> +int iommufd_get_hw_info(struct iommufd_ucmd *ucmd)
> +{
> + struct iommu_hw_info *cmd = ucmd->cmd;
> + void __user *user_ptr = u64_to_user_ptr(cmd->data_uptr);
> + const struct iommu_ops *ops;
> + struct iommufd_device *idev;
> + unsigned int data_len;
> + unsigned int copy_len;
> + void *data = NULL;
> + int rc;
> +
> + if (cmd->flags || cmd->__reserved)
> + return -EOPNOTSUPP;
> +
> + idev = iommufd_get_device(ucmd, cmd->dev_id);
> + if (IS_ERR(idev))
> + return PTR_ERR(idev);
> +
> + ops = dev_iommu_ops(idev->dev);
> + if (ops->hw_info) {
> + data = ops->hw_info(idev->dev, &data_len, &cmd->out_data_type);
> + if (IS_ERR(data)) {
> + rc = PTR_ERR(data);
> + goto err_put;
> + }
> +
> + /*
> + * drivers that have hw_info callback should have a unique
> + * iommu_hw_info_type.
> + */
> + if (WARN_ON_ONCE(cmd->out_data_type ==
> + IOMMU_HW_INFO_TYPE_NONE)) {
> + rc = -ENODEV;
> + goto out;
> + }
> + } else {
> + cmd->out_data_type = IOMMU_HW_INFO_TYPE_NONE;
> + data_len = 0;
> + data = NULL;

data is already initialized to NULL at its declaration, so this
assignment is redundant.

> +
> + /*
> + * We return the length the kernel supports so userspace may know what
> + * the kernel capability is. It could be larger than the input buffer.
> + */
> + cmd->data_len = data_len;
> +
> + rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
> +out:

out_free:

> + kfree(data);
> +err_put:

out_put: (since this is also used in the success path)
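To illustrate the naming, here is a minimal userspace sketch of the same
unwind pattern, where each label says what it releases and the success
path falls through both. struct res, query() and the fail_* knobs are
hypothetical stand-ins (the refcount plays the role of the device
reference), not code from this series:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object such as the device. */
struct res {
	int refs;
};

/* fail_alloc/fail_check only exist to exercise the error paths. */
static int query(struct res *r, int fail_alloc, int fail_check)
{
	void *data = NULL;
	int rc;

	assert(r != NULL);
	r->refs++;			/* take the reference */

	data = fail_alloc ? NULL : malloc(16);
	if (!data) {
		rc = -ENOMEM;
		goto out_put;		/* nothing to free yet */
	}

	if (fail_check) {
		rc = -ENODEV;
		goto out_free;
	}

	rc = 0;				/* success falls through the labels */
out_free:
	free(data);
out_put:
	r->refs--;			/* drop the reference on every path */
	return rc;
}
```

With "out_" prefixes on both labels it is obvious that they are shared by
the success and error paths.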

> + * To capture an iommu type specific hardware information data, @data_uptr and
> + * its length @data_len must be provided. Trailing bytes will be zeroed if the
> + * user buffer is larger than the data that kernel has. Otherwise, kernel only
> + * fills the buffer using the given length in @data_len. If the ioctl succeeds,
> + * @data_len will be updated to the length that kernel actually supports,
> + * @out_data_type will be filled to decode the data filled in the buffer
> + * pointed by @data_uptr. Input @data_len == zero is allowed, no information
> + * data will be filled to user, but user space could get the iommu_hw_info_type
> + * filled in @out_data_type and the iommu hardware information data length
> + * supported by kernel filled in @data_len.

I'd just keep "Input @data_len == zero is allowed" and drop the rest of
that sentence, which only repeats what the preceding text already says.
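For reference, the buffer semantics described in that comment can be
modelled in a few lines of plain userspace C. This is only a sketch of
the documented behaviour, assuming a flat byte buffer on both sides;
hw_info_fill() and its parameter names are made up for illustration and
it is not the kernel implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model: copy at most ulen bytes of the kernel's data into
 * the user buffer, zero any trailing user bytes, and return the length
 * the kernel supports (which may exceed ulen).
 */
static size_t hw_info_fill(void *ubuf, size_t ulen,
			   const void *kdata, size_t klen)
{
	size_t copy_len = ulen < klen ? ulen : klen;

	assert(kdata != NULL || klen == 0);

	if (copy_len)
		memcpy(ubuf, kdata, copy_len);

	/* User buffer larger than the kernel data: zero the remainder. */
	if (ulen > copy_len)
		memset((char *)ubuf + copy_len, 0, ulen - copy_len);

	/* Always report the kernel's length, even when ulen == 0. */
	return klen;
}
```

The ulen == 0 case needs no special handling here: nothing is copied and
the caller still learns the supported length, which is why the short
sentence is sufficient.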

Reviewed-by: Kevin Tian <kevin.tian@xxxxxxxxx>