Re: [RFC PATCH 00/21] iommu/amd: Introduce support for HW accelerated vIOMMU w/ nested page table

From: Suthikulpanit, Suravee
Date: Fri Jun 23 2023 - 22:09:04 EST




On 6/23/2023 3:56 PM, Jason Gunthorpe wrote:
> On Fri, Jun 23, 2023 at 03:05:06PM -0700, Suthikulpanit, Suravee wrote:

> > For example, an AMD IOMMU hardware instance is normally listed as a PCI
> > device (e.g. PCI ID 00:00.2). To set up the IOMMU PAS for this IOMMU
> > instance, the IOMMU driver allocates an IOMMU v1 page table for this
> > device, which contains the PAS mapping.

> So it is just system DRAM?

Yes, this is no different from the IOMMU page table for a particular device, which contains the mapping from IOMMU Private Address (IPA) to SPA. The IPA is defined in the IOMMU spec. Please see Figures 79 and 80 of the following document for the IPA mapping used by the hardware.

https://www.amd.com/system/files/TechDocs/48882_3.07_PUB.pdf

> > The IOMMU hardware uses the PAS for storing Guest IOMMU information such as
> > Guest MMIOs, DevID Mapping Table, DomID Mapping Table, and Guest
> > Command/Event/PPR logs.

> Why does it have to be in kernel memory?
>
> Why not store the whole thing in user mapped memory and have the VMM
> manipulate it directly?

The Guest MMIO and CmdBuf Dirty Status are allocated per IOMMU instance, so these data structures cannot be allocated by the VMM. In this case, IOMMUFD_CMD_MMIO_ACCESS might still be needed.

The DomID and DevID mapping tables are allocated per-VM:
* DomID Mapping Table (512 KB contiguous memory)
* DevID Mapping Table (1 MB contiguous memory)

Let's say we can use IOMMU_SET_DEV_DATA to communicate the memory addresses of the DomID/DevID Mapping Tables to the IOMMU driver, which would then pin them and map them into the PAS IOMMU page table. Then this might work. Is that along the lines of what you are thinking (mainly to avoid introducing an additional ioctl)?
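
To make the idea above a bit more concrete, here is a rough sketch of the kind of per-VM payload and sanity check it might imply. Everything named here (struct viommu_idmap_tables, the viommu_* helpers, the 4 KB alignment requirement) is hypothetical and is not taken from the patch series or the iommufd uAPI; only the two table sizes come from the list above. A real driver would additionally have to pin the pages and map them into the PAS page table, which is not shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Table sizes as stated in this thread. */
#define VIOMMU_DOMID_TABLE_SIZE (512u * 1024u)   /* 512 KB */
#define VIOMMU_DEVID_TABLE_SIZE (1024u * 1024u)  /* 1 MB */

/* Assumption: tables must start on a 4 KB page boundary. */
#define VIOMMU_PAGE_SIZE 4096u

/*
 * Hypothetical per-VM payload a VMM could hand to the IOMMU driver.
 * Field names are illustrative only.
 */
struct viommu_idmap_tables {
	uint64_t domid_table_uaddr; /* user address of DomID Mapping Table */
	uint64_t devid_table_uaddr; /* user address of DevID Mapping Table */
};

/* A range is acceptable if it is non-NULL, page aligned, and does not wrap. */
static bool viommu_range_ok(uint64_t uaddr, uint64_t size)
{
	if (uaddr == 0 || (uaddr % VIOMMU_PAGE_SIZE) != 0)
		return false;
	if (uaddr + size < uaddr) /* unsigned overflow means wraparound */
		return false;
	return true;
}

/* Both tables must be individually valid and must not overlap each other. */
static bool viommu_idmap_tables_valid(const struct viommu_idmap_tables *t)
{
	uint64_t dom_end, dev_end;

	if (!viommu_range_ok(t->domid_table_uaddr, VIOMMU_DOMID_TABLE_SIZE) ||
	    !viommu_range_ok(t->devid_table_uaddr, VIOMMU_DEVID_TABLE_SIZE))
		return false;

	dom_end = t->domid_table_uaddr + VIOMMU_DOMID_TABLE_SIZE;
	dev_end = t->devid_table_uaddr + VIOMMU_DEVID_TABLE_SIZE;

	return dom_end <= t->devid_table_uaddr ||
	       dev_end <= t->domid_table_uaddr;
}
```

With a check like this in place, the driver side would only need the two addresses from the VMM; the sizes are fixed, so no additional ioctl argument would be needed for them.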

By the way, I think I can try getting rid of IOMMUFD_CMD_CMDBUF_UPDATE. Let me do that in the next RFC.

Thanks,
Suravee