Re: [PATCH] Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"

From: Sibi Sankar
Date: Mon Nov 21 2022 - 05:13:05 EST




On 11/21/22 12:12, Manivannan Sadhasivam wrote:
On Fri, Nov 18, 2022 at 12:33:49PM +0000, Will Deacon wrote:
On Fri, Nov 18, 2022 at 04:24:02PM +0530, Manivannan Sadhasivam wrote:
On Mon, Nov 14, 2022 at 05:38:00PM +0000, Catalin Marinas wrote:
On Mon, Nov 14, 2022 at 03:14:21PM +0000, Robin Murphy wrote:
On 2022-11-14 14:11, Will Deacon wrote:
On Mon, Nov 14, 2022 at 04:33:29PM +0530, Manivannan Sadhasivam wrote:
This reverts commit c44094eee32f32f175aadc0efcac449d99b1bbf7.

As reported by Amit [1], dropping cache invalidation from
arch_dma_prep_coherent() triggers a crash on the Qualcomm SM8250 platform
(most probably on other Qcom platforms too). The reason is that the Qcom
qcom_q6v5_mss driver copies the firmware metadata and shares it with the
modem for validation. The modem has a secure block (XPU) that will trigger
a whole system crash if the shared memory is accessed by the CPU while the
modem is poking at it.

To avoid this issue, the qcom_q6v5_mss driver allocates a chunk of memory
with no kernel mapping, vmap's it, copies the firmware metadata and
vunmap's it. Finally, the address is shared with the modem for metadata
validation [2].

Now, because of the removal of cache invalidation from
arch_dma_prep_coherent(), there will be cache lines associated with this
memory even after it is shared with the modem. So when the CPU accesses it,
the XPU violation gets triggered.

This last part is a non-sequitur: the buffer is no longer mapped on the CPU
side, so how would the CPU access it?

Right, for the previous change to have made a difference, the offending
part of this buffer must be present in some cache somewhere *before* the
DMA buffer allocation completes.

Clearly that driver is completely broken though. If the DMA allocation came
from a no-map carveout via dma_alloc_from_dev_coherent() then the vmap()
shenanigans wouldn't work, so if it is backed by struct pages then the whole
dance is still pointless because *a cacheable linear mapping exists*, and
it's just relying on the reduced chance that anything's going to re-fetch
the linear map address after those pages have been allocated, exactly as I
called out previously [1].

So I guess a DMA pool that's not mapped in the linear map, together with
memremap() instead of vmap(), would work around the issue. But the
driver needs fixing, not the arch code.
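
For illustration, a minimal sketch of that suggestion, assuming the
metadata lives in a no-map reserved-memory node hooked up to the device
through a hypothetical extra "memory-region" phandle (the node name,
address, phandle index and function below are made up for the example;
this is not the actual driver code):

#include <linux/device.h>
#include <linux/io.h>
#include <linux/ioport.h>
#include <linux/of.h>
#include <linux/of_address.h>
#include <linux/string.h>

/*
 * Sketch only: copy the firmware metadata into a no-map carveout,
 * described in DT roughly as
 *
 *	mdata_mem: metadata-region@98000000 {
 *		reg = <0x0 0x98000000 0x0 0x4000>;
 *		no-map;
 *	};
 *
 * Because of "no-map" the region never gets a cacheable linear-map
 * alias, and memremap() gives the CPU a transient non-cacheable
 * mapping just for the copy, as suggested above.
 */
static int copy_metadata_to_carveout(struct device *dev,
				     const void *metadata, size_t size,
				     phys_addr_t *phys_out)
{
	struct device_node *node;
	struct resource r;
	void *vaddr;
	int ret;

	/* hypothetical index of the metadata carveout phandle */
	node = of_parse_phandle(dev->of_node, "memory-region", 1);
	if (!node)
		return -EINVAL;

	ret = of_address_to_resource(node, 0, &r);
	of_node_put(node);
	if (ret)
		return ret;

	if (size > resource_size(&r))
		return -EINVAL;

	vaddr = memremap(r.start, size, MEMREMAP_WC);
	if (!vaddr)
		return -ENOMEM;

	memcpy(vaddr, metadata, size);
	memunmap(vaddr);

	/* this physical address is what gets handed to the modem */
	*phys_out = r.start;
	return 0;
}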


Okay, thanks for the hint. Can you share how to allocate the dma-pool that's
not part of the kernel's linear map? I looked into it but couldn't find a way.

The no-map property should take care of this iirc


Yeah, we have been using it in other places in the same driver. But as per
Sibi, we used dynamic allocation for metadata validation since there was no
memory reserved statically for that.

Will,

Unlike the other portions of the driver that required statically defined
no-map carveouts, the metadata just needed a contiguous chunk of memory for
authentication. Re-using existing carveouts for this metadata region may
not work due to modem FW limitations, and declaring a new carveout for
metadata will break the device tree bindings. That's the reason for using
DMA_ATTR_NO_KERNEL_MAPPING with dma_alloc_attrs() and vmap()/vunmap() with
VM_FLUSH_RESET_PERMS before passing the memory on to the modem (the current
flow is sketched below for reference). Are there other suggestions for
achieving the same without breaking the bindings?
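
For reference, the flow in question looks roughly like this (a simplified
sketch, not the exact qcom_q6v5_mss code; the SCM call that assigns the
region to the modem and most cleanup are omitted, and the helper name is
made up):

#include <linux/dma-mapping.h>
#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/vmalloc.h>

/*
 * Sketch only.  With DMA_ATTR_NO_KERNEL_MAPPING, dma_alloc_attrs()
 * returns an opaque cookie instead of a kernel virtual address; on the
 * dma-direct/CMA path that cookie is the underlying struct page, which
 * is what this sketch relies on.  The buffer is only mapped transiently
 * via vmap() for the memcpy().
 */
static int share_metadata_with_modem(struct device *dev,
				     const void *metadata, size_t size,
				     dma_addr_t *phys_out)
{
	/* the modem expects one physically contiguous block */
	unsigned long attrs = DMA_ATTR_FORCE_CONTIGUOUS |
			      DMA_ATTR_NO_KERNEL_MAPPING;
	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
	struct page **pages;
	struct page *page;
	dma_addr_t phys;
	void *vaddr;
	int i;

	page = dma_alloc_attrs(dev, size, &phys, GFP_KERNEL, attrs);
	if (!page)
		return -ENOMEM;

	pages = kmalloc_array(count, sizeof(*pages), GFP_KERNEL);
	if (!pages) {
		dma_free_attrs(dev, size, page, phys, attrs);
		return -ENOMEM;
	}
	for (i = 0; i < count; i++)
		pages[i] = nth_page(page, i);

	/*
	 * Transient mapping for the copy; VM_FLUSH_RESET_PERMS makes
	 * vunmap() flush the mapping and reset direct-map permissions.
	 * pgprot_dmacoherent() gives non-cacheable attributes matching
	 * the DMA allocation (arm64).
	 */
	vaddr = vmap(pages, count, VM_MAP | VM_FLUSH_RESET_PERMS,
		     pgprot_dmacoherent(PAGE_KERNEL));
	kfree(pages);
	if (!vaddr) {
		dma_free_attrs(dev, size, page, phys, attrs);
		return -ENOMEM;
	}

	memcpy(vaddr, metadata, size);
	vunmap(vaddr);

	/* phys is what subsequently gets assigned to the modem */
	*phys_out = phys;
	return 0;
}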

- Sibi


But if we do not have a way to allocate dynamic memory that is not part of
the kernel's linear map, then we may have to resort to using an existing
reserved memory region.

Thanks,
Mani

Will