Re: [PATCH v6 0/4] Add parameter for disabling ACS redirection for P2P

From: Alex Williamson
Date: Mon Jul 16 2018 - 11:06:26 EST


On Mon, 16 Jul 2018 15:01:21 +1000
Alexey Kardashevskiy <aik@xxxxxxxxx> wrote:

> On 14/7/18 9:31 am, Logan Gunthorpe wrote:
> > Changes since v5:
> > * Add a quirk to handle the Intel SPT PCH case (as pointed out by Alex)
> > * Warn in the case that we try to disable ACS redirect on a device
> > that doesn't have the ACS capability (also suggested by Alex)
> > * Collect reviewed-by tag from Alex
> > * Rebased onto v4.18-rc4 (no conflicts)
> >
> > Changes since v4:
> > * Fixed a couple documentation mistakes spotted by Randy
> >
> > Changes since v3:
> > * Removed some of the cruft that was copied from the resource_alignment
> > parameter (per Alex)
> > * A number of documentation fixes as noticed by Alex and Willy
> >
> > Changes since v2:
> > * Rebased onto v4.18-rc1 (no conflicts)
> > * Minor tweaks to the documentation per Andy
> > * Removed the "path:" prefix and use the path parsing code
> > for simple devices (as it works the same). Per a suggestion from Alex
> >
> > Changes since v1:
> > * Reworked pci_dev_str_match_path using strrchr as suggested by Alex
> > * Collected Christian's Acks
> >
> > --
> >
> > Hi,
> >
> > As discussed in our PCI P2PDMA series, we'd like to add a kernel
> > parameter for disabling ACS redirection on selected bridges. Since
> > this turned out to be a small series in itself, we've decided to send
> > it separately from the P2P work.
> >
> > This series generalizes the code behind the existing resource_alignment
> > option. The first patch creates a helper function to match PCI devices
> > against strings, based on the code that already existed in
> > pci_specified_resource_alignment().
> >
> > The second patch expands the new helper to optionally take a path of
> > PCI devfns. This is to address Alex's renumbering concern when using
> > simple bus-devfns. The implementation is essentially how he described it and
> > similar to the Intel VT-d spec (Section 8.3.1).
> >
> > The final patch adds the disable_acs_redir kernel parameter which takes
> > a list of PCI devices and will disable the ACS P2P Request Redirect,
> > ACS P2P Completion Redirect and ACS P2P Egress Control bits for the
> > selected devices. This allows P2P traffic between the selected bridges
> > and, because it's done at boot before the IOMMU groups are created,
> > the groups will match the security provided by ACS.
>
>
> I am pretty sure it's been discussed, but just to make sure I understand the
> whole picture: why exactly does ACS have to be disabled at boot time?
> We could enable it, for example, for 2 devices in the same VFIO container
> if they are in an isolatable part of the PCI tree. Or do we just not want to
> make VFIO containers or QEMU aware of the PCI hierarchy (I can see why, just
> double checking)? Thanks.

AIUI, vfio is not necessarily a primary use case here; native bare
metal drivers might also want to perform direct p2p. In the vfio case,
any time we're allowing p2p via ACS, we're poking holes into the IOVA
space presented to the user. We don't have a good way for the user to
handle that, or even learn about it, so there are quite a few issues if
vfio were a use case here. Currently the intersection with vfio is
that when ACS is disabled, it introduces p2p channels which break
device isolation. These need to be reflected in the IOMMU groups, so
ACS is disabled at boot time, before the groups are created. If we wanted to
allow dynamic manipulation, we'd effectively need to soft unplug entire
sub-hierarchies around the point where ACS is modified and re-add the
devices in order to get the grouping correct. Thanks,

Alex
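
For anyone following along, here is a rough sketch of what "disabling ACS
redirect" amounts to at the register level: clearing the P2P Request
Redirect, P2P Completion Redirect and P2P Egress Control bits in a bridge's
ACS Control register. The bit values below match the PCI_ACS_* definitions
in include/uapi/linux/pci_regs.h, but the helper name is made up for
illustration and this is not the code from the series; it only shows the
mask arithmetic on a 16-bit control word, with no config space access.

```c
#include <stdint.h>

/*
 * ACS Control register bits named in the cover letter. The values follow
 * the PCI_ACS_* definitions in include/uapi/linux/pci_regs.h.
 */
#define PCI_ACS_RR 0x0004 /* P2P Request Redirect */
#define PCI_ACS_CR 0x0008 /* P2P Completion Redirect */
#define PCI_ACS_EC 0x0020 /* P2P Egress Control */

/*
 * Hypothetical helper (not from the patch series): return the given ACS
 * Control value with the three redirect-related bits cleared and every
 * other bit left untouched.
 */
static uint16_t acs_ctrl_without_redirect(uint16_t ctrl)
{
	return ctrl & (uint16_t)~(PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC);
}
```

In a real kernel path the value would presumably be read from and written
back to the device's ACS capability in config space, and as discussed above
that has to happen before the IOMMU groups are built.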