Re: [PATCH] iommu/io-pgtable-arm: Allow non-coherent masters to use system cache

From: isaacm
Date: Fri Jan 08 2021 - 13:10:15 EST


On 2021-01-07 21:47, Sai Prakash Ranjan wrote:
On 2021-01-07 22:27, isaacm@xxxxxxxxxxxxxx wrote:
On 2021-01-06 03:56, Will Deacon wrote:
On Thu, Dec 24, 2020 at 12:10:07PM +0530, Sai Prakash Ranjan wrote:
commit ecd7274fb4cd ("iommu: Remove unused IOMMU_SYS_CACHE_ONLY flag")
removed unused IOMMU_SYS_CACHE_ONLY prot flag and along with it went
the memory type setting required for the non-coherent masters to use
system cache. Now that system cache support for GPU is added, we will
need to mark the memory as normal sys-cached for GPU to use system cache.
Without this, the system cache lines are not allocated for GPU. We use
the IO_PGTABLE_QUIRK_ARM_OUTER_WBWA quirk instead of a page protection
flag as the flag cannot be exposed via DMA api because of no in-tree
users.

Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@xxxxxxxxxxxxxx>
---
drivers/iommu/io-pgtable-arm.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 7c9ea9d7874a..3fb7de8304a2 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -415,6 +415,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
else if (prot & IOMMU_CACHE)
pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
+ else if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA)
+ pte |= (ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE
+ << ARM_LPAE_PTE_ATTRINDX_SHIFT);
}

While this approach of enabling system cache globally for both page
tables and other buffers
works for the GPU usecase, this isn't ideal for other clients that use
system cache. For example,
video clients only want to cache a subset of their buffers in the
system cache, due to the sizing constraint
imposed by how much of the system cache they can use. So, it would be
ideal to have
a way of expressing the desire to use the system cache on a per-buffer
basis. Additionally,
our video clients use the DMA layer, and since the requirement is for
caching in the system cache
to be a per buffer attribute, it seems like we would have to have a
DMA attribute to express
this on a per-buffer basis.


I did bring this up initially [1], also where is this video client
in upstream? AFAIK, only system cache user in upstream is GPU.
We cannot add any DMA attribute unless there is any user upstream
Right, there wouldn't be an upstream user, which would be problematic,
but I was thinking of having it so that when video or any of our other
clients that use this attribute on a per buffer basis upstreams their
code, it's not too much of a stretch to add the support.
as per [2], so when the support for such a client is added, wouldn't
((data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA) || PROT_FLAG)
work?
I don't think that will work, because we currently have clients who use the
system cache as follows:
-cache only page tables in the system cache
-cache only data buffers in the system cache
-cache both page tables and all buffers in the system cache
-cache both page tables and some buffers in the system cache

The approach you're suggesting doesn't allow for the last case, as caching the
page tables in the system cache involves setting IO_PGTABLE_QUIRK_ARM_OUTER_WBWA,
so we will end up losing the flexibility to cache some data buffers in the system cache.

Ideally, the page table quirk would drive the settings for the TCR, and the prot flag
drives the PTE for the mapping, as is done with the page table walker being dma-coherent,
while buffers are mapped as cacheable based on IOMMU_CACHE. Thoughts?

Thanks,
Isaac

[1]
https://lore.kernel.org/dri-devel/ecfda7ca80f6d7b4ff3d89b8758f4dc9@xxxxxxxxxxxxxx/
[2] https://lore.kernel.org/linux-iommu/20191026053026.GA14545@xxxxxx/T/

Thanks,
Sai