[GIT PULL] Please pull IOMMUFD subsystem changes

From: Jason Gunthorpe
Date: Wed Aug 30 2023 - 19:41:01 EST


Hi Linus,

This PR includes several items that have been in progress for quite
some time now; details are in the tag.

For those following, these series are still progressing:

- User page table invalidation:
https://lore.kernel.org/all/20230724110406.107212-1-yi.l.liu@xxxxxxxxx/

- Intel VT-d nested translation:
https://lore.kernel.org/all/20230724111335.107427-1-yi.l.liu@xxxxxxxxx/

- ARM SMMUv3 nested translation:
https://lore.kernel.org/all/cover.1683688960.git.nicolinc@xxxxxxxxxx/

- Draft AMD IOMMU nested translation:
https://lore.kernel.org/all/20230621235508.113949-1-suravee.suthikulpanit@xxxxxxx/

There is also a lot of ongoing work to generically enable PASID support in all
the IOMMU drivers:
SMMUv3:
https://lore.kernel.org/linux-iommu/20230621063825.268890-1-mshavit@xxxxxxxxxx/
AMD:
https://lore.kernel.org/all/20230821104227.706997-1-vasant.hegde@xxxxxxx/
https://lore.kernel.org/all/20230821104956.707235-1-vasant.hegde@xxxxxxx/
https://lore.kernel.org/all/20230816174031.634453-1-vasant.hegde@xxxxxxx/

Which will see exposure through the iommufd uAPI soon.

Along with qemu patches implementing iommufd:
https://lore.kernel.org/all/20230830103754.36461-1-zhenzhong.duan@xxxxxxxxx/

Draft patches for the qemu-side support for nested translation in the
vIOMMU drivers are linked from the above.

Thanks,
Jason

The following changes since commit 2ccdd1b13c591d306f0401d98dedc4bdcd02b421:

Linux 6.5-rc6 (2023-08-13 11:29:55 -0700)

are available in the Git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git tags/for-linus-iommufd

for you to fetch changes up to eb501c2d96cfce6b42528e8321ea085ec605e790:

iommufd/selftest: Don't leak the platform device memory when unloading the module (2023-08-18 12:56:24 -0300)

----------------------------------------------------------------
iommufd for 6.6

This includes a shared branch with VFIO:

- Enhance VFIO_DEVICE_GET_PCI_HOT_RESET_INFO so it can work with iommufd
FDs, not just group FDs. This removes the last place in the uAPI that
required the group fd.

- Give VFIO a new device node /dev/vfio/devices/vfioX (the so-called cdev
node) which is very similar to the FD from VFIO_GROUP_GET_DEVICE_FD.
The cdev is associated with the struct device that the VFIO driver is
bound to and shows up in sysfs in the normal way.

- Add a cdev IOCTL VFIO_DEVICE_BIND_IOMMUFD which allows a newly opened
/dev/vfio/devices/vfioX to be associated with an IOMMUFD. This replaces
the VFIO_GROUP_SET_CONTAINER flow.

- Add cdev IOCTLs VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT to allow the IOMMU
translation the vfio_device is associated with to be changed. This is a
significant new feature for VFIO as previously each vfio_device was
fixed to a single translation.

The translation is under the control of iommufd, so it can be any of
the different translation modes that iommufd is learning to create.

At this point VFIO has compilation options to remove the legacy interfaces
and in modern mode it behaves like a normal driver subsystem. The
/dev/vfio/iommu and /dev/vfio/groupX nodes are not present and each
vfio_device only has a /dev/vfio/devices/vfioX cdev node that represents
the device.

On top of this is built some of the new iommufd functionality:

- IOMMU_HWPT_ALLOC allows userspace to directly create the low level
IO Page table objects and affiliate them with IOAS objects that hold
the translation mapping. This is the basic functionality for the
normal IOMMU_DOMAIN_PAGING domains.

- VFIO_DEVICE_ATTACH_IOMMUFD_PT can be used to replace the current
translation. This is wired up through all the layers down to the
driver so the driver has the ability to implement a hitless
replacement. This is necessary to fully support guest behaviors when
emulating HW (e.g. a guest atomic change of translation).

- IOMMU_GET_HW_INFO returns information about the IOMMU driver HW that
owns a VFIO device. This includes support for the Intel IOMMU, and
patches have been posted for all the other server IOMMUs.

Along the way are a number of internal items:

- New iommufd kapis iommufd_ctx_has_group(), iommufd_device_to_ictx(),
iommufd_device_to_id(), iommufd_access_detach(), iommufd_ctx_from_fd(),
iommufd_device_replace()

- iommufd now internally tracks iommu_groups as it needs some per-group
data

- Reorganize the internal hwpt allocation flow to have more robust
locking

- Improve the access interfaces to support detach and replace of an IOAS
from an access

- New selftests and a rework of how the selftests create a mock iommu
driver to be more like a real iommu driver

----------------------------------------------------------------
Jason Gunthorpe (21):
Merge branch 'v6.6/vfio/cdev' of https://github.com/awilliam/linux-vfio into iommufd for-next
iommufd: Move isolated msi enforcement to iommufd_device_bind()
iommufd: Add iommufd_group
iommufd: Replace the hwpt->devices list with iommufd_group
iommu: Export iommu_get_resv_regions()
iommufd: Keep track of each device's reserved regions instead of groups
iommufd: Use the iommufd_group to avoid duplicate MSI setup
iommufd: Make sw_msi_start a group global
iommufd: Move putting a hwpt to a helper function
iommufd: Add enforced_cache_coherency to iommufd_hw_pagetable_alloc()
iommufd: Allow a hwpt to be aborted after allocation
iommufd: Fix locking around hwpt allocation
iommufd: Reorganize iommufd_device_attach into iommufd_device_change_pt
iommufd: Add iommufd_device_replace()
iommufd: Make destroy_rwsem use a lock class per object type
iommufd: Add IOMMU_HWPT_ALLOC
iommufd/selftest: Return the real idev id from selftest mock_domain
iommufd/selftest: Add a selftest for IOMMU_HWPT_ALLOC
iommufd/selftest: Make the mock iommu driver into a real driver
Merge tag 'v6.5-rc6' into iommufd for-next
iommufd: Remove iommufd_ref_to_users()

Lu Baolu (1):
iommu: Add new iommu op to get iommu hardware information

Nicolin Chen (11):
iommufd/device: Add iommufd_access_detach() API
iommu: Introduce a new iommu_group_replace_domain() API
iommufd/selftest: Test iommufd_device_replace()
vfio: Do not allow !ops->dma_unmap in vfio_pin/unpin_pages()
iommufd: Allow passing in iopt_access_list_id to iopt_remove_access()
iommufd: Add iommufd_access_change_ioas(_id) helpers
iommufd: Use iommufd_access_change_ioas in iommufd_access_destroy_object
iommufd: Add iommufd_access_replace() API
iommufd/selftest: Add IOMMU_TEST_OP_ACCESS_REPLACE_IOAS coverage
vfio: Support IO page table replacement
iommufd/selftest: Add coverage for IOMMU_GET_HW_INFO ioctl

Yang Yingliang (1):
iommufd/selftest: Don't leak the platform device memory when unloading the module

Yi Liu (38):
vfio/pci: Update comment around group_fd get in vfio_pci_ioctl_pci_hot_reset()
vfio/pci: Move the existing hot reset logic to be a helper
iommufd: Reserve all negative IDs in the iommufd xarray
iommufd: Add iommufd_ctx_has_group()
iommufd: Add helper to retrieve iommufd_ctx and devid
vfio: Mark cdev usage in vfio_device
vfio: Add helper to search vfio_device in a dev_set
vfio/pci: Extend VFIO_DEVICE_GET_PCI_HOT_RESET_INFO for vfio device cdev
vfio/pci: Copy hot-reset device info to userspace in the devices loop
vfio/pci: Allow passing zero-length fd array in VFIO_DEVICE_PCI_HOT_RESET
vfio: Allocate per device file structure
vfio: Refine vfio file kAPIs for KVM
vfio: Accept vfio device file in the KVM facing kAPI
kvm/vfio: Prepare for accepting vfio device fd
kvm/vfio: Accept vfio device file from userspace
vfio: Pass struct vfio_device_file * to vfio_device_open/close()
vfio: Block device access via device fd until device is opened
vfio: Add cdev_device_open_cnt to vfio_group
vfio: Make vfio_df_open() single open for device cdev path
vfio-iommufd: Move noiommu compat validation out of vfio_iommufd_bind()
vfio-iommufd: Split bind/attach into two steps
vfio: Record devid in vfio_device_file
vfio-iommufd: Add detach_ioas support for physical VFIO devices
vfio-iommufd: Add detach_ioas support for emulated VFIO devices
vfio: Move vfio_device_group_unregister() to be the first operation in unregister
vfio: Move device_del() before waiting for the last vfio_device registration refcount
vfio: Add cdev for vfio_device
vfio: Test kvm pointer in _vfio_device_get_kvm_safe()
iommufd: Add iommufd_ctx_from_fd()
vfio: Avoid repeated user pointer cast in vfio_device_fops_unl_ioctl()
vfio: Add VFIO_DEVICE_BIND_IOMMUFD
vfio: Add VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT
vfio: Move the IOMMU_CAP_CACHE_COHERENCY check in __vfio_register_dev()
vfio: Compile vfio_group infrastructure optionally
docs: vfio: Add vfio device cdev description
iommu: Move dev_iommu_ops() to private header
iommufd: Add IOMMU_GET_HW_INFO
iommu/vt-d: Implement hw_info for iommu capability query

Documentation/driver-api/vfio.rst | 147 ++++-
Documentation/virt/kvm/devices/vfio.rst | 47 +-
drivers/gpu/drm/i915/gvt/kvmgt.c | 1 +
drivers/iommu/intel/iommu.c | 19 +
drivers/iommu/iommu-priv.h | 30 +
drivers/iommu/iommu.c | 81 ++-
drivers/iommu/iommufd/Kconfig | 4 +-
drivers/iommu/iommufd/device.c | 801 ++++++++++++++++++-----
drivers/iommu/iommufd/hw_pagetable.c | 112 +++-
drivers/iommu/iommufd/io_pagetable.c | 36 +-
drivers/iommu/iommufd/iommufd_private.h | 86 +--
drivers/iommu/iommufd/iommufd_test.h | 19 +
drivers/iommu/iommufd/main.c | 61 +-
drivers/iommu/iommufd/selftest.c | 213 ++++--
drivers/s390/cio/vfio_ccw_ops.c | 1 +
drivers/s390/crypto/vfio_ap_ops.c | 1 +
drivers/vfio/Kconfig | 27 +
drivers/vfio/Makefile | 3 +-
drivers/vfio/device_cdev.c | 228 +++++++
drivers/vfio/fsl-mc/vfio_fsl_mc.c | 1 +
drivers/vfio/group.c | 173 +++--
drivers/vfio/iommufd.c | 145 +++-
drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 2 +
drivers/vfio/pci/mlx5/main.c | 1 +
drivers/vfio/pci/vfio_pci.c | 1 +
drivers/vfio/pci/vfio_pci_core.c | 250 ++++---
drivers/vfio/platform/vfio_amba.c | 1 +
drivers/vfio/platform/vfio_platform.c | 1 +
drivers/vfio/vfio.h | 218 +++++-
drivers/vfio/vfio_main.c | 258 +++++++-
include/linux/iommu.h | 16 +-
include/linux/iommufd.h | 9 +
include/linux/vfio.h | 66 +-
include/uapi/linux/iommufd.h | 97 +++
include/uapi/linux/kvm.h | 13 +-
include/uapi/linux/vfio.h | 148 ++++-
samples/vfio-mdev/mbochs.c | 1 +
samples/vfio-mdev/mdpy.c | 1 +
samples/vfio-mdev/mtty.c | 1 +
tools/testing/selftests/iommu/iommufd.c | 130 +++-
tools/testing/selftests/iommu/iommufd_fail_nth.c | 71 +-
tools/testing/selftests/iommu/iommufd_utils.h | 144 +++-
virt/kvm/vfio.c | 137 ++--
43 files changed, 3130 insertions(+), 672 deletions(-)
create mode 100644 drivers/iommu/iommu-priv.h
create mode 100644 drivers/vfio/device_cdev.c
