Re: [PATCH] Revert "Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"""

From: Doug Anderson
Date: Thu Jun 15 2023 - 18:08:23 EST


Hi,

On Thu, Jun 15, 2023 at 12:04 PM Robin Murphy <robin.murphy@xxxxxxx> wrote:
>
> > Presumably the fact that the firmware gets a physical address means
> > that the firmware needs to map this address somehow itself. I can try
> > to dig up what the firmware is doing if needed (what attributes it
> > uses to map), but I guess the hope is that it shouldn't matter.
>
> It absolutely matters. Linux has been told (by DT) that this device does
> not snoop caches, and therefore is acting on that information by using
> the non-cacheable remap. There is nothing inherently wrong with that,
> even when the "device" is actually firmware running on the same CPU -
> EL3 could well run with the MMU off, or just make a point of not
> accessing Non-Secure memory with cacheable attributes to avoid
> side-channels. However if in this case the SCM firmware *is* using
> cacheable attributes, as the symptoms would suggest, then either the
> firmware or the DT is wrong, and there is nothing Linux can do to
> completely fix that.

With help from minecrell on IRC, we've found that firmare _does_ map
it as cachable.


> > If this wild guessing is
> > correct, maybe a more correct solution would be to simply unmap the
> > memory from the kernel before passing the physical address to the
> > firmware, if that's possible...
>
> Having now looked at the SCM driver, TBH it doesn't make an awful lot of
> sense for it to be using dma_alloc_coherent() there anyway - it's not
> using it as a coherent buffer, it's doing a one-off unidirectional
> transfer of a relatively small amount of data in a manner which seems to
> be exactly the usage model for the streaming DMA API. And I think using
> the latter would happen to mitigate this problem too - with streaming
> DMA you'd put the dma_map_page() call *after* all the buffer data has
> been written, right before the SMC call, so even with a coherency
> mismatch there would essentially be no opportunity for the caches to get
> out of sync.

Switching to the streaming API for this function _does_ work, but for
now I'm not going to make this switch and instead going to go with the
fix to add "dma-coherent" [1]. That seems like the more correct fix
instead of just a mitigation. If someone wants to switch the SCM
driver to the streaming APIs, that'd be cool too. On IRC, minecrell
pointed out that at least one function in this file,
qcom_scm_ice_set_key(), purposely didn't use the streaming APIs.

[1] https://lore.kernel.org/r/20230615145253.1.Ic62daa649b47b656b313551d646c4de9a7da4bd4@changeid

-Doug