Re: [PATCH 04/11] PCI: portdrv: Suppress kernel DMA ownership auto-claiming

From: Bjorn Helgaas
Date: Mon Nov 15 2021 - 19:07:19 EST


On Mon, Nov 15, 2021 at 10:05:45AM +0800, Lu Baolu wrote:
> IOMMU grouping on PCI necessitates that if we lack isolation on a bridge
> then all of the downstream devices will be part of the same IOMMU group
> as the bridge.

I think this means something like: "If a PCIe Switch Downstream Port
lacks <a specific set of ACS capabilities>, all downstream devices
will be part of the same IOMMU group as the switch," right?

If so, can you fill in the details to make it specific and concrete?

> As long as the bridge kernel driver doesn't map and
> access any PCI mmio bar, it's safe to bind it to the device in a USER-
> owned group. Hence, safe to suppress the kernel DMA ownership auto-
> claiming.

s/mmio/MMIO/ (also below)
s/bar/BAR/ (also below)

I don't understand what "kernel DMA ownership auto-claiming" means.
Presumably that's explained in previous patches and a code comment
near "suppress_auto_claim_dma_owner".

> The commit 5f096b14d421b ("vfio: Whitelist PCI bridges") permitted a
> class of kernel drivers.

Permitted them to do what?

> This is not always safe. For example, the SHPC
> system design requires that it must be integrated into a PCI-to-PCI
> bridge or a host bridge.

If this SHPC example is important, it would be nice to have a citation
to the spec section that requires this.

> The shpchp_core driver relies on the PCI mmio
> bar access for the controller functionality. Binding it to the device
> belonging to a USER-owned group will allow the user to change the
> controller via p2p transactions which is unknown to the hot-plug driver
> and could lead to some unpredictable consequences.
>
> Now that we have driver self-declaration of safety we should rely on that.

Can you spell out what drivers are self-declaring? Are they declaring
that they don't program their devices to do DMA?

> This change may cause regression on some platforms, since all bridges were
> exempted before, but now they have to be manually audited before doing so.
> This is actually the desired outcome anyway.

Please spell out what regression this may cause and how users would
recognize it. Also, please give a hint about why that is desirable.

> Suggested-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Suggested-by: Kevin Tian <kevin.tian@xxxxxxxxx>
> Signed-off-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
> ---
> drivers/pci/pcie/portdrv_pci.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
> index 35eca6277a96..1285862a9aa8 100644
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -203,6 +203,8 @@ static struct pci_driver pcie_portdriver = {
> .err_handler = &pcie_portdrv_err_handler,
>
> .driver.pm = PCIE_PORTDRV_PM_OPS,
> +
> + .driver.suppress_auto_claim_dma_owner = true,
> };
>
> static int __init dmi_pcie_pme_disable_msi(const struct dmi_system_id *d)
> --
> 2.25.1
>