Re: [PATCH v12 4/6] iommu/s390: Disable deferred flush for ISM devices

From: Matthew Rosato
Date: Fri Aug 25 2023 - 14:25:52 EST


On 8/25/23 6:11 AM, Niklas Schnelle wrote:
> ISM devices are virtual PCI devices used for cross-LPAR communication.
> Unlike real PCI devices, ISM devices do not use the hardware IOMMU but
> instead inspect the IOMMU translation tables directly on IOTLB flush
> (s390 RPCIT instruction).
>
> ISM devices keep their DMA allocations static and only very rarely DMA
> unmap at all. For each IOTLB flush that occurs after an unmap, however,
> the ISM device inspects the area of the IOVA space indicated by the
> flush. This means that for the global IOTLB flushes used by the flush
> queue mechanism the entire IOVA space would be inspected. In principle
> this would be fine, albeit potentially unnecessarily slow. It turns out,
> however, that ISM devices are sensitive to seeing IOVA addresses that
> are currently in use in the IOVA range being flushed. Seeing such in-use
> IOVA addresses will cause the ISM device to enter an error state and
> become unusable.
>
> Fix this by claiming IOMMU_CAP_DEFERRED_FLUSH only for non-ISM devices.
> This makes sure IOTLB flushes only cover IOVAs that have been unmapped
> and also restricts the range of the IOTLB flush, potentially reducing
> latency spikes.
>
> Signed-off-by: Niklas Schnelle <schnelle@xxxxxxxxxxxxx>

Reviewed-by: Matthew Rosato <mjrosato@xxxxxxxxxxxxx>

> ---
> drivers/iommu/s390-iommu.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
> index f6d6c60e5634..8310180a102c 100644
> --- a/drivers/iommu/s390-iommu.c
> +++ b/drivers/iommu/s390-iommu.c
> @@ -315,11 +315,13 @@ static struct s390_domain *to_s390_domain(struct iommu_domain *dom)
>
>  static bool s390_iommu_capable(struct device *dev, enum iommu_cap cap)
>  {
> +	struct zpci_dev *zdev = to_zpci_dev(dev);
> +
>  	switch (cap) {
>  	case IOMMU_CAP_CACHE_COHERENCY:
>  		return true;
>  	case IOMMU_CAP_DEFERRED_FLUSH:
> -		return true;
> +		return zdev->pft != PCI_FUNC_TYPE_ISM;
>  	default:
>  		return false;
>  	}
>
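
For anyone following along outside the kernel tree, the per-device
capability check in the hunk above can be modelled with a small
standalone sketch. Everything prefixed mock_/MOCK_ is a hypothetical
stand-in invented for illustration and not a kernel type or value; in
particular, mock_pick_flush_mode() only assumes that a caller would
fall back to strict, per-unmap flushing when the capability is not
claimed.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the function type and capability enums. */
enum mock_pft { MOCK_PFT_OTHER, MOCK_PFT_ISM };
enum mock_cap {
	MOCK_CAP_CACHE_COHERENCY,
	MOCK_CAP_DEFERRED_FLUSH,
	MOCK_CAP_OTHER,
};
enum mock_flush_mode { MOCK_FLUSH_STRICT, MOCK_FLUSH_DEFERRED };

/* Hypothetical stand-in for the per-function device structure. */
struct mock_zdev {
	enum mock_pft pft;
};

/* Mirrors the shape of the patched check: deferred flush is claimed
 * for every function type except ISM. */
static bool mock_capable(const struct mock_zdev *zdev, enum mock_cap cap)
{
	switch (cap) {
	case MOCK_CAP_CACHE_COHERENCY:
		return true;
	case MOCK_CAP_DEFERRED_FLUSH:
		return zdev->pft != MOCK_PFT_ISM;
	default:
		return false;
	}
}

/* Assumed caller behaviour: use the flush queue only when the device
 * claims the capability, otherwise flush on each unmap. */
static enum mock_flush_mode mock_pick_flush_mode(const struct mock_zdev *zdev)
{
	return mock_capable(zdev, MOCK_CAP_DEFERRED_FLUSH) ?
		MOCK_FLUSH_DEFERRED : MOCK_FLUSH_STRICT;
}
```

With this model an ISM function ends up with strict flushing, so each
flush covers only the range that was just unmapped, while other
function types keep the deferred flush queue.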