[RFC PATCH] arm64: mm: Add invalidate back in arch_sync_dma_for_device when FROM_DEVICE

From: Nanyong Sun
Date: Thu Nov 17 2022 - 01:33:30 EST


Commit c50f11c6196f ("arm64: mm: Don't invalidate FROM_DEVICE
buffers at start of DMA transfer") replaced the cache invalidate with
a clean for DMA_FROM_DEVICE. This changes the behavior of functions
such as dma_map_single() and
dma_sync_single_for_device(*, *, *, DMA_FROM_DEVICE), and may break
some drivers, because the implementation of these DMA APIs no longer
performs the cache invalidation it used to.

Situation 1:
Many mainline drivers call
dma_sync_single_for_device(*, *, *, DMA_FROM_DEVICE) to sync a buffer.
Before the implementation changed, this invalidated the cache; now it
only cleans it, which may violate what those drivers originally
expected and result in errors.

Situation 2:
After backporting the above commit, we found that a network card
driver fails with cache inconsistency during DMA transfers: the CPU
gets stale data from the cache when reading data received from the
device by DMA. A similar problem occurs with SATA disk drivers. It
involves mainline modules such as libata, scsi and ahci, and it is
hard to find out which line of code causes the error.
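
The stale read described in the two situations can be illustrated with
a user-space toy model. This is only a sketch, not kernel code: the
single-line write-back cache and every name in it (toy_clean,
toy_clean_inval, rx_without_cpu_sync, ...) are hypothetical, and it
deliberately models a driver that reads the buffer without any further
sync for the CPU after the device has written it.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define LINE_SIZE 8

/* Hypothetical toy model: one write-back cache line in front of "DRAM". */
struct toy_cache {
	unsigned char data[LINE_SIZE];
	bool valid;
	bool dirty;
};

static unsigned char dram[LINE_SIZE];
static struct toy_cache cache;

/* On a miss, fill the line from DRAM. */
static void toy_fill(void)
{
	if (!cache.valid) {
		memcpy(cache.data, dram, LINE_SIZE);
		cache.valid = true;
		cache.dirty = false;
	}
}

/* CPU read: a hit returns the cached copy, a miss refills from DRAM. */
static unsigned char cpu_read(int off)
{
	assert(off >= 0 && off < LINE_SIZE);
	toy_fill();
	return cache.data[off];
}

/* CPU write: allocate the line and mark it dirty. */
static void cpu_write(int off, unsigned char v)
{
	assert(off >= 0 && off < LINE_SIZE);
	toy_fill();
	cache.data[off] = v;
	cache.dirty = true;
}

/* "Clean": write dirty data back, but keep the line valid in the cache. */
static void toy_clean(void)
{
	if (cache.valid && cache.dirty) {
		memcpy(dram, cache.data, LINE_SIZE);
		cache.dirty = false;
	}
}

/* "Clean + invalidate": write back, then drop the line from the cache. */
static void toy_clean_inval(void)
{
	toy_clean();
	cache.valid = false;
}

/* Device DMA write: goes straight to DRAM and bypasses the cache. */
static void device_dma_write(unsigned char v)
{
	memset(dram, v, LINE_SIZE);
}

/*
 * Model of a receive path that syncs the buffer for the device and then
 * reads it back with no further sync for the CPU.
 */
unsigned char rx_without_cpu_sync(bool invalidate)
{
	memset(dram, 0, LINE_SIZE);
	cache.valid = false;
	cache.dirty = false;

	cpu_write(0, 0x11);		/* buffer is cached and dirty */
	if (invalidate)
		toy_clean_inval();	/* behavior this patch restores */
	else
		toy_clean();		/* behavior since c50f11c6196f */
	device_dma_write(0xAB);		/* device fills the buffer */
	return cpu_read(0);		/* CPU reads the "received" data */
}
```

In this model rx_without_cpu_sync(false) returns the stale 0x11 that
is still held in the cache line, while rx_without_cpu_sync(true)
returns the 0xAB written by the device. The model ignores speculative
cache fills and any invalidate done when syncing for the CPU.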

It seems that some drivers may break and would have to be adapted to
the changed implementation of the DMA APIs. It is also confusing that
the behavior of these DMA APIs on arm64 differs from that on other
architectures.

Add the invalidate back in arch_sync_dma_for_device() to keep drivers
compatible, by replacing dcache_clean_poc() with
dcache_clean_inval_poc() when the direction is DMA_FROM_DEVICE.

Fixes: c50f11c6196f ("arm64: mm: Don't invalidate FROM_DEVICE buffers at start of DMA transfer")
Signed-off-by: Nanyong Sun <sunnanyong@xxxxxxxxxx>
---
 arch/arm64/mm/dma-mapping.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 3cb101e8cb29..07f6a3089c64 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -18,7 +18,10 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
 {
 	unsigned long start = (unsigned long)phys_to_virt(paddr);
 
-	dcache_clean_poc(start, start + size);
+	if (dir == DMA_FROM_DEVICE)
+		dcache_clean_inval_poc(start, start + size);
+	else
+		dcache_clean_poc(start, start + size);
 }
 
 void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
--
2.25.1