Re: [PATCH v2] iommu/vt-d: Add a fix for devices need extra dtlb flush

From: Baolu Lu
Date: Tue Nov 29 2022 - 19:11:22 EST


Hi Jacob,

On 2022/11/29 1:04, Jacob Pan wrote:
QAT devices on Intel Sapphire Rapids and Emerald Rapids have a defect in
address translation service (ATS). These devices may inadvertently issue
ATS invalidation completion before posted writes initiated with
translated address that utilized translations matching the invalidation
address range, violating the invalidation completion ordering.

This patch adds an extra device TLB invalidation for the affected devices,
it is needed to ensure no more posted writes with translated address
following the invalidation completion. Therefore, the ordering is
preserved and data-corruption is prevented.

Device TLBs are invalidated under the following six conditions:
1. Device driver does DMA API unmap IOVA
2. Device driver unbind a PASID from a process, sva_unbind_device()
3. PASID is torn down, after PASID cache is flushed. e.g. process
exit_mmap() due to crash
4. Under SVA usage, called by mmu_notifier.invalidate_range() where
VM has to free pages that were unmapped
5. userspace driver unmaps a DMA buffer
6. Cache invalidation in vSVA usage (upcoming)

For #1 and #2, device drivers are responsible for stopping DMA traffic
before unmap/unbind. For #3, iommu driver gets mmu_notifier to
invalidate TLB the same way as normal user unmap which will do an extra
invalidation. The dTLB invalidation after PASID cache flush does not
need an extra invalidation.

Therefore, we only need to deal with #4 and #5 in this patch. #1 is also
covered by this patch due to common code path with #5.

Tested-by: Yuzhang Luo <yuzhang.luo@xxxxxxxxx>
Signed-off-by: Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx>
Reviewed-by: Ashok Raj <ashok.raj@xxxxxxxxx>
---
v2
- removed risky_device() check based on the review by Robin, added comments
to explain the exemption.
- reworked commit message based on the review from Ashok
---
drivers/iommu/intel/iommu.c | 67 +++++++++++++++++++++++++++++++++++--
drivers/iommu/intel/iommu.h | 3 ++
drivers/iommu/intel/svm.c | 4 ++-
3 files changed, 71 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 996a8b5ee5ee..a97e9247037f 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1396,6 +1396,23 @@ static void domain_update_iotlb(struct dmar_domain *domain)
spin_unlock_irqrestore(&domain->lock, flags);
}
+/*
+ * Impacted QAT device IDs ranging from 0x4940 to 0x4943.
+ * This quirk is exempted from risky_device() check because it applies only
+ * to the built-in QAT devices.
+ */

How about adding:

"This quirk is exempted from risky_device() check because it doesn't
effectively grant additional privileges."

?

--
Best regards,
baolu