Re: [PATCH v2 2/2] iommu/vt-d: Fix NULL domain on device release

From: Baolu Lu
Date: Mon Mar 04 2024 - 04:39:15 EST


On 2024/3/4 16:59, Tian, Kevin wrote:
From: Baolu Lu <baolu.lu@xxxxxxxxxxxxxxx>
Sent: Monday, March 4, 2024 4:07 PM

On 2024/3/4 15:36, Tian, Kevin wrote:
From: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
Sent: Thursday, February 29, 2024 5:46 PM

+
+/*
+ * Cache invalidation for changes to a scalable-mode context table
+ * entry.
+ *
+ * Section 6.5.3.3 of the VT-d spec:
+ * - Device-selective context-cache invalidation;
+ * - Domain-selective PASID-cache invalidation to affected domains
+ * (can be skipped if all PASID entries were not-present);
+ * - Domain-selective IOTLB invalidation to affected domains;

the spec talks about domain-selective but the code actually does
global invalidation.

I should have included the following comments below:

/* Given that we have no idea about which domain IDs and PASIDs were
* used in the pasid table, upgrade them to global PASID and IOTLB
* cache invalidation. This doesn't impact the performance significantly
* as the clearing context entry is not a critical path.
*/


but then it affects all other perf-critical paths which rely on the cache
for other devices...

You are right. Good consideration.


It's preferable to restrict overhead to this release path only e.g. walking
the PASID table to identify affected DIDs and PASIDs instead of expanding
the impact to system wide.

The sm_context_flush_caches() could be used in two different paths:
- Deferred attachment case;
- Normal device release path.

For the formal case, we have to use global cache invalidation; but the
the latter case, it's fine to skip these cache invalidation. The new
helper probably looks like below.

/*
* Cache invalidation for changes to a scalable-mode context table
* entry.
*
* Section 6.5.3.3 of the VT-d spec:
* - Device-selective context-cache invalidation;
* - Domain-selective PASID-cache invalidation to affected domains
* (can be skipped if all PASID entries were not-present);
* - Domain-selective IOTLB invalidation to affected domains;
* - Global Device-TLB invalidation to affected functions.
*
* For kdump cases, old valid entries may be cached due to the in-flight
* DMA and copied pgtable, but there is no unmapping behaviour for them,
* thus we need explicit cache flushes for all affected domain IDs and
* PASIDs used in the copied PASID table. Given that we have no idea about
* which domain IDs and PASIDs were used in the copied tables, upgrade
* them to global PASID and IOTLB cache invalidation.
*
* For normal case, the iommu has been parked in blocking state. All PASID
* entries are in non-present now. Skip PASID and IOTLB cache invalidation.
*/
static void sm_context_flush_caches(struct device *dev)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu = info->iommu;

iommu->flush.flush_context(iommu, 0, PCI_DEVID(info->bus, info->devfn),
DMA_CCMD_MASK_NOBIT, DMA_CCMD_DEVICE_INVL);
if (context_copied(iommu, info->bus, info->devfn)) {
qi_flush_pasid_cache(iommu, 0, QI_PC_GLOBAL, 0);
iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH);
}
devtlb_invalidation_with_pasid(iommu, dev, IOMMU_NO_PASID);
}

Does it look good for you?

Best regards,
baolu