RE: [PATCH 02/12] iommu/vt-d: Add cache tag invalidation helpers

From: Tian, Kevin
Date: Thu Mar 28 2024 - 03:40:01 EST


> From: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
> Sent: Monday, March 25, 2024 10:17 AM
>
> +
> +static unsigned long calculate_psi_aligned_address(unsigned long start,
> +						    unsigned long end,
> +						    unsigned long *_pages,
> +						    unsigned long *_mask)
> +{
> +	unsigned long pages = (end - start + VTD_PAGE_SIZE - 1) >> VTD_PAGE_SHIFT;

this doesn't sound correct. You'd want to follow aligned_nrpages().
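
For reference, a sketch of the difference (assuming "end" is inclusive
here, as the callers appear to treat it): the expression above drops the
sub-page offset of "start" and can come out one page short, while an
aligned_nrpages()-style calculation folds that offset into the size
before rounding up:

	/* current: ignores the offset of start within its page */
	pages = (end - start + VTD_PAGE_SIZE - 1) >> VTD_PAGE_SHIFT;

	/* aligned_nrpages() style: add the offset back in, then round up */
	pages = ALIGN((start & ~VTD_PAGE_MASK) + (end - start + 1),
		      VTD_PAGE_SIZE) >> VTD_PAGE_SHIFT;

e.g. a two-byte range crossing a page boundary yields 1 page with the
former and 2 with the latter.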

> +		case CACHE_TAG_TYPE_DEVTLB:
> +			if (tag->pasid == IOMMU_NO_PASID)
> +				qi_flush_dev_iotlb(iommu, sid, info->pfsid,
> +						   info->ats_qdep, addr, mask);
> +			else
> +				qi_flush_dev_iotlb_pasid(iommu, sid, info->pfsid,
> +							 tag->pasid, info->ats_qdep,
> +							 addr, mask);
> +
> +			quirk_extra_dev_tlb_flush(info, addr, mask, tag->pasid, info->ats_qdep);
> +			break;
> +		case CACHE_TAG_TYPE_PARENT_DEVTLB:
> +			/*
> +			 * Address translation cache in device side caches the
> +			 * result of nested translation. There is no easy way
> +			 * to identify the exact set of nested translations
> +			 * affected by a change in S2. So just flush the entire
> +			 * device cache.
> +			 */
> +			qi_flush_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep,
> +					   0, MAX_AGAW_PFN_WIDTH);
> +			quirk_extra_dev_tlb_flush(info, 0, MAX_AGAW_PFN_WIDTH,
> +						  IOMMU_NO_PASID, info->ats_qdep);
> +			break;

The parent devtlb can have a pasid too, though it's not enabled now. As
these are core helpers, we can probably support it now, leading to simpler code:

case CACHE_TAG_TYPE_PARENT_DEVTLB:
//change addr/mask
//fallthrough
case CACHE_TAG_TYPE_DEVTLB:
//what this patch does
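
Fleshed out, the merged cases might look roughly like this (just a
sketch reusing the names from the quoted hunk; the flush_addr/flush_mask
locals are hypothetical per-tag copies so that widening the range for
the parent case doesn't leak into later iterations of the list walk):

/* hypothetical per-tag copies of the values computed before the loop */
unsigned long flush_addr = addr, flush_mask = mask;
...
case CACHE_TAG_TYPE_PARENT_DEVTLB:
	/*
	 * Nested translations cached on the device side cannot be
	 * tracked precisely; widen to a full device-TLB flush and
	 * share the plain DEVTLB code below.
	 */
	flush_addr = 0;
	flush_mask = MAX_AGAW_PFN_WIDTH;
	fallthrough;
case CACHE_TAG_TYPE_DEVTLB:
	if (tag->pasid == IOMMU_NO_PASID)
		qi_flush_dev_iotlb(iommu, sid, info->pfsid,
				   info->ats_qdep, flush_addr, flush_mask);
	else
		qi_flush_dev_iotlb_pasid(iommu, sid, info->pfsid,
					 tag->pasid, info->ats_qdep,
					 flush_addr, flush_mask);
	quirk_extra_dev_tlb_flush(info, flush_addr, flush_mask,
				  tag->pasid, info->ats_qdep);
	break;

This would also let the parent case honor tag->pasid once that is
enabled, instead of always passing IOMMU_NO_PASID.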

> +void cache_tag_flush_all(struct dmar_domain *domain)
> +{
> +	struct cache_tag *tag;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&domain->cache_lock, flags);
> +	list_for_each_entry(tag, &domain->cache_tags, node) {
> +		struct device_domain_info *info = dev_iommu_priv_get(tag->dev);
> +		struct intel_iommu *iommu = tag->iommu;
> +		u16 sid = PCI_DEVID(info->bus, info->devfn);
> +
> +		switch (tag->type) {
> +		case CACHE_TAG_TYPE_IOTLB:
> +		case CACHE_TAG_TYPE_PARENT_IOTLB:
> +			if (domain->use_first_level)
> +				qi_flush_piotlb(iommu, tag->domain_id,
> +						tag->pasid, 0, -1, 0);
> +			else
> +				iommu->flush.flush_iotlb(iommu, tag->domain_id,
> +							 0, 0, DMA_TLB_DSI_FLUSH);
> +			break;
> +		case CACHE_TAG_TYPE_DEVTLB:
> +		case CACHE_TAG_TYPE_PARENT_DEVTLB:
> +			qi_flush_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep,
> +					   0, MAX_AGAW_PFN_WIDTH);
> +			quirk_extra_dev_tlb_flush(info, 0, MAX_AGAW_PFN_WIDTH,
> +						  IOMMU_NO_PASID, info->ats_qdep);
> +			break;
> +		}
> +	}

Could this loop be consolidated with the one in the previous helper?
Anyway, the only difference is the addr/mask being passed down... (see
the sketch below)
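
e.g. something along these lines, purely as a sketch (the
__cache_tag_flush() name and its parameter list are hypothetical; its
body would be the locked list walk with the switch from the range
helper):

static void __cache_tag_flush(struct dmar_domain *domain, unsigned long addr,
			      unsigned long mask, unsigned long pages)
{
	/* spin_lock + list_for_each_entry + switch, as in flush_range() */
}

void cache_tag_flush_range(struct dmar_domain *domain, unsigned long start,
			   unsigned long end)
{
	unsigned long pages, mask, addr;

	addr = calculate_psi_aligned_address(start, end, &pages, &mask);
	__cache_tag_flush(domain, addr, mask, pages);
}

void cache_tag_flush_all(struct dmar_domain *domain)
{
	/* whole-address-space values steer the walker to DSI/full-devtlb */
	__cache_tag_flush(domain, 0, MAX_AGAW_PFN_WIDTH, (unsigned long)-1);
}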

> +/*
> + * Invalidate a range of IOVA when IOMMU is in caching mode and new mappings
> + * are added to the target domain.
> + */
> +void cache_tag_flush_cm_range(struct dmar_domain *domain, unsigned long start,
> +			      unsigned long end)
> +{

I'm also not sure why this is worth a separate helper. Why couldn't it
be handled by the previous flush_range()?

> +	unsigned long pages, mask, addr;
> +	struct cache_tag *tag;
> +	unsigned long flags;
> +
> +	addr = calculate_psi_aligned_address(start, end, &pages, &mask);
> +
> +	spin_lock_irqsave(&domain->cache_lock, flags);
> +	list_for_each_entry(tag, &domain->cache_tags, node) {
> +		struct intel_iommu *iommu = tag->iommu;
> +
> +		/*
> +		 * When IOMMU is enabled in caching mode some non-present
> +		 * mappings may be cached by the IOTLB if it uses second-
> +		 * only translation.
> +		 */
> +		if (!cap_caching_mode(iommu->cap) || domain->use_first_level) {
> +			iommu_flush_write_buffer(iommu);
> +			continue;
> +		}

The comment really doesn't explain what this block does. Given its
intent, it's probably more suitable to keep this on the caller side, as
is done today.

> +
> +		if (tag->type == CACHE_TAG_TYPE_IOTLB ||
> +		    tag->type == CACHE_TAG_TYPE_PARENT_IOTLB) {
> +			/*
> +			 * Fallback to domain selective flush if no
> +			 * PSI support or the size is too big.
> +			 */
> +			if (!cap_pgsel_inv(iommu->cap) ||
> +			    mask > cap_max_amask_val(iommu->cap))
> +				iommu->flush.flush_iotlb(iommu, tag->domain_id,
> +							 0, 0, DMA_TLB_DSI_FLUSH);
> +			else
> +				iommu->flush.flush_iotlb(iommu, tag->domain_id,
> +							 addr, mask,
> +							 DMA_TLB_PSI_FLUSH);
> +		}
> +	}

You skipped devtlb invalidation here. Yes, it's not necessary with the
current nested translation design, and this part is inconsistent across
the different paths.

But as this is a semantics change, you may want to first remove the
devtlb invalidation in a separate patch and then do this cleanup, so
bisecting is easier.