Re: [PATCH v3 5/6] iommu/vt-d: Flush PASID-based iotlb for iova over first level

From: Lu Baolu
Date: Fri Dec 13 2019 - 22:25:18 EST


Hi Liu Yi,

On 12/13/19 7:42 PM, Liu, Yi L wrote:
From: kvm-owner@xxxxxxxxxxxxxxx [mailto:kvm-owner@xxxxxxxxxxxxxxx] On Behalf Of Lu Baolu
Sent: Wednesday, December 11, 2019 10:12 AM
To: Joerg Roedel <joro@xxxxxxxxxx>; David Woodhouse <dwmw2@xxxxxxxxxxxxx>;
Subject: [PATCH v3 5/6] iommu/vt-d: Flush PASID-based iotlb for iova over first level

When software has changed first-level tables, it should invalidate
the affected IOTLB and the paging-structure-caches using the PASID-
based-IOTLB Invalidate Descriptor defined in spec 6.5.2.4.

Signed-off-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
---
drivers/iommu/dmar.c | 41 ++++++++++++++++++++++++++++++++++
drivers/iommu/intel-iommu.c | 44 ++++++++++++++++++++++++-------------
include/linux/intel-iommu.h | 2 ++
3 files changed, 72 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 3acfa6a25fa2..fb30d5053664 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1371,6 +1371,47 @@ void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
qi_submit_sync(&desc, iommu);
}

+/* PASID-based IOTLB invalidation */
+void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u32 pasid, u64 addr,
+ unsigned long npages, bool ih)
+{
+ struct qi_desc desc = {.qw2 = 0, .qw3 = 0};
+
+ /*
+ * npages == -1 means a PASID-selective invalidation, otherwise,
+ * a positive value for Page-selective-within-PASID invalidation.
+ * 0 is not a valid input.
+ */
+ if (WARN_ON(!npages)) {
+ pr_err("Invalid input npages = %ld\n", npages);
+ return;
+ }
+
+ if (npages == -1) {
+ desc.qw0 = QI_EIOTLB_PASID(pasid) |
+ QI_EIOTLB_DID(did) |
+ QI_EIOTLB_GRAN(QI_GRAN_NONG_PASID) |
+ QI_EIOTLB_TYPE;
+ desc.qw1 = 0;
+ } else {
+ int mask = ilog2(__roundup_pow_of_two(npages));
+ unsigned long align = (1ULL << (VTD_PAGE_SHIFT + mask));
+
+ if (WARN_ON_ONCE(!IS_ALIGNED(addr, align)))
+ addr &= ~(align - 1);
+
+ desc.qw0 = QI_EIOTLB_PASID(pasid) |
+ QI_EIOTLB_DID(did) |
+ QI_EIOTLB_GRAN(QI_GRAN_PSI_PASID) |
+ QI_EIOTLB_TYPE;
+ desc.qw1 = QI_EIOTLB_ADDR(addr) |
+ QI_EIOTLB_IH(ih) |
+ QI_EIOTLB_AM(mask);
+ }
+
+ qi_submit_sync(&desc, iommu);
+}
+
/*
* Disable Queued Invalidation interface.
*/
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 83a7abf0c4f0..e47f5fe37b59 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1520,18 +1520,24 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,

if (ih)
ih = 1 << 6;
- /*
- * Fallback to domain selective flush if no PSI support or the size is
- * too big.
- * PSI requires page size to be 2 ^ x, and the base address is naturally
- * aligned to the size
- */
- if (!cap_pgsel_inv(iommu->cap) || mask > cap_max_amask_val(iommu->cap))
- iommu->flush.flush_iotlb(iommu, did, 0, 0,
- DMA_TLB_DSI_FLUSH);
- else
- iommu->flush.flush_iotlb(iommu, did, addr | ih, mask,
- DMA_TLB_PSI_FLUSH);
+
+ if (domain_use_first_level(domain)) {
+ qi_flush_piotlb(iommu, did, domain->default_pasid,
+ addr, pages, ih);
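
For reference, a minimal user-space sketch of how the page-selective path of the quoted qi_flush_piotlb() turns npages into an address-mask order and a naturally aligned base. The helper names psi_order() and psi_base() are hypothetical and not part of the patch, and VTD_PAGE_SHIFT is assumed to be 12 (4 KiB pages).

```c
#include <assert.h>
#include <stdint.h>

#define VTD_PAGE_SHIFT 12 /* assumption: 4 KiB VT-d pages */

/*
 * Hypothetical helper: smallest order such that (1 << order) >= npages.
 * Models ilog2(__roundup_pow_of_two(npages)); npages == 0 is invalid.
 */
unsigned int psi_order(unsigned long npages)
{
	unsigned int order = 0;

	assert(npages != 0);
	while ((1UL << order) < npages)
		order++;
	return order;
}

/*
 * Hypothetical helper: round addr down to the naturally aligned base
 * of a region covering 2^order pages.
 */
uint64_t psi_base(uint64_t addr, unsigned int order)
{
	uint64_t align = 1ULL << (VTD_PAGE_SHIFT + order);

	return addr & ~(align - 1);
}
```

With these assumptions, a request for 3 pages at 0x5000 is widened to order 2 (4 pages) with base 0x4000, so the invalidated range is a superset of the requested one.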

I'm not sure if my understanding is correct. But let me tell a story.
Assuming we assign an mdev and a PF/VF to a single VM, then there
will be p_iotlb tagged with PASID_RID2PASID and p_iotlb tagged with
default_pasid. We may want to flush both... If this operation is

I assume that SRIOV and SIOV are exclusive. You can't enable both SRIOV
and SIOV on a single device. So the mdev and PF/VF are from different
devices, right?

Or, in the SRIOV case, you can wrap a PF or VF as a mediated device. But
this mdev will still be backed by the RID2PASID PASID.

invoked per-device, then we need to pass in a hint to indicate whether
to use PASID_RID2PASID or default_pasid, or you may just issue two
flushes with the two PASID values. Thoughts?

This is per-domain, and each domain has its own domain ID and default
PASID (assuming the default domain is 0 in the RID2PASID case).
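
To illustrate the per-domain point, here is a rough user-space model, not driver code. The types model_domain and model_flush and the function model_pick_flush() are hypothetical; only the idea that the domain ID and default PASID come from the domain itself is taken from the discussion above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of a domain that matter here. */
struct model_domain {
	uint16_t did;           /* domain ID */
	uint32_t default_pasid; /* PASID used for first-level translation */
	bool use_first_level;   /* models domain_use_first_level() */
};

/* Hypothetical description of the flush that would be issued. */
struct model_flush {
	bool pasid_based; /* true: PASID-based IOTLB invalidation */
	uint16_t did;
	uint32_t pasid;   /* only meaningful when pasid_based is true */
};

/*
 * Both keys of the flush come from the domain, so the caller does not
 * have to pass a per-device hint about which PASID to use.
 */
struct model_flush model_pick_flush(const struct model_domain *domain)
{
	struct model_flush f = {
		.pasid_based = domain->use_first_level,
		.did = domain->did,
		.pasid = domain->use_first_level ? domain->default_pasid : 0,
	};

	return f;
}
```

In this model, two domains with different default PASIDs simply produce two different flushes, one per domain.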

Best regards,
baolu