Re: [PATCH] Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"

From: Manivannan Sadhasivam
Date: Mon Nov 21 2022 - 01:42:50 EST


On Fri, Nov 18, 2022 at 12:33:49PM +0000, Will Deacon wrote:
> On Fri, Nov 18, 2022 at 04:24:02PM +0530, Manivannan Sadhasivam wrote:
> > On Mon, Nov 14, 2022 at 05:38:00PM +0000, Catalin Marinas wrote:
> > > On Mon, Nov 14, 2022 at 03:14:21PM +0000, Robin Murphy wrote:
> > > > On 2022-11-14 14:11, Will Deacon wrote:
> > > > > On Mon, Nov 14, 2022 at 04:33:29PM +0530, Manivannan Sadhasivam wrote:
> > > > > > This reverts commit c44094eee32f32f175aadc0efcac449d99b1bbf7.
> > > > > >
> > > > > > As reported by Amit [1], dropping cache invalidation from
> > > > > > arch_dma_prep_coherent() triggers a crash on the Qualcomm SM8250 platform
> > > > > > (most probably on other Qcom platforms too). The reason is that the
> > > > > > qcom_q6v5_mss driver copies the firmware metadata and shares it with the
> > > > > > modem for validation. The modem has a secure block (XPU) that will trigger
> > > > > > a whole-system crash if the shared memory is accessed by the CPU while the
> > > > > > modem is poking at it.
> > > > > >
> > > > > > To avoid this issue, the qcom_q6v5_mss driver allocates a chunk of memory
> > > > > > with no kernel mapping, vmap's it, copies the firmware metadata into it,
> > > > > > and vunmap's it. Finally, the address is shared with the modem for metadata
> > > > > > validation [2].
> > > > > >
> > > > > > Now, because of the removal of cache invalidation from
> > > > > > arch_dma_prep_coherent(), there will be cache lines associated with this
> > > > > > memory even after it has been shared with the modem. So when the CPU
> > > > > > accesses it, the XPU violation gets triggered.
> > > > >
> > > > > This last part is a non sequitur: the buffer is no longer mapped on the CPU
> > > > > side, so how would the CPU access it?
> > > >
> > > > Right, for the previous change to have made a difference the offending part
> > > > of this buffer must be present in some cache somewhere *before* the DMA
> > > > buffer allocation completes.
> > > >
> > > > Clearly that driver is completely broken, though. If the DMA allocation came
> > > > from a no-map carveout via dma_alloc_from_dev_coherent() then the vmap()
> > > > shenanigans wouldn't work, so if it is backed by struct pages then the whole
> > > > dance is still pointless because *a cacheable linear mapping exists*, and
> > > > it's just relying on the reduced chance that anything is going to re-fetch
> > > > the linear map address after those pages have been allocated, exactly as I
> > > > called out previously [1].
> > >
> > > So I guess a DMA pool that's not mapped in the linear map, together with
> > > memremap() instead of vmap(), would work around the issue. But the
> > > driver needs fixing, not the arch code.
> > >
> >
> > Okay, thanks for the hint. Can you share how to allocate a DMA pool that's
> > not part of the kernel's linear map? I looked into it but couldn't find a way.
>
> The no-map property should take care of this iirc
>

Yeah, we have been using it in other places in the same driver. But as per
Sibi, we used dynamic allocation for the metadata validation buffer since no
memory was reserved statically for it.
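
For context, the dynamic path looks roughly like the untested sketch below
(a simplified reconstruction of what the commit message above describes, with
made-up names, not the exact qcom_q6v5_mss code). Since the pages still come
from the page allocator, a cacheable alias in the linear map exists regardless
of the vmap()/vunmap(), which is exactly the problem Robin points out above:

#include <linux/dma-mapping.h>
#include <linux/mm.h>
#include <linux/pgtable.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/vmalloc.h>

/*
 * Untested sketch: allocate DMA memory without a kernel mapping, vmap() the
 * backing pages just long enough to copy the firmware metadata, then hand
 * the DMA address to the modem.
 */
static int metadata_copy_dynamic(struct device *dev, const void *metadata,
				 size_t size, dma_addr_t *dma_out)
{
	unsigned long attrs = DMA_ATTR_NO_KERNEL_MAPPING |
			      DMA_ATTR_FORCE_CONTIGUOUS;
	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
	struct page **pages;
	struct page *page;
	void *cookie, *vaddr;
	unsigned int i;

	cookie = dma_alloc_attrs(dev, size, dma_out, GFP_KERNEL, attrs);
	if (!cookie)
		return -ENOMEM;

	pages = kmalloc_array(count, sizeof(*pages), GFP_KERNEL);
	if (!pages) {
		dma_free_attrs(dev, size, cookie, *dma_out, attrs);
		return -ENOMEM;
	}

	/* Assumes DMA address == physical address on this platform */
	page = phys_to_page(*dma_out);
	for (i = 0; i < count; i++)
		pages[i] = nth_page(page, i);

	/* Temporary, non-cacheable CPU alias just for the copy */
	vaddr = vmap(pages, count, VM_MAP, pgprot_dmacoherent(PAGE_KERNEL));
	kfree(pages);
	if (!vaddr) {
		dma_free_attrs(dev, size, cookie, *dma_out, attrs);
		return -ENOMEM;
	}

	memcpy(vaddr, metadata, size);
	vunmap(vaddr);

	/*
	 * The pages were still allocated through the page allocator, so a
	 * cacheable alias in the kernel's linear map exists the whole time;
	 * vunmap() only removes the temporary alias created above.
	 */
	return 0;
}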

But if there is no way to dynamically allocate memory that is not part of the
kernel's linear map, then we may have to resort to using an existing
reserved-memory region.
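
If we do go the reserved-memory route, I imagine it would look something like
the untested sketch below, following the suggestion above: take the buffer
from a statically reserved no-map carveout and memremap() it for the CPU copy
instead of vmap(). The memory-region index and function names here are purely
illustrative:

#include <linux/device.h>
#include <linux/io.h>
#include <linux/ioport.h>
#include <linux/of.h>
#include <linux/of_address.h>
#include <linux/string.h>

/*
 * Untested sketch: copy the firmware metadata into a no-map carveout, which
 * never gets a linear-map alias, via a temporary memremap() mapping.
 */
static int metadata_copy_reserved(struct device *dev, const void *metadata,
				  size_t size, phys_addr_t *phys_out)
{
	struct device_node *node;
	struct resource r;
	void *vaddr;
	int ret;

	/* e.g. a dedicated no-map carveout listed in the node's memory-region */
	node = of_parse_phandle(dev->of_node, "memory-region", 1);
	if (!node)
		return -EINVAL;

	ret = of_address_to_resource(node, 0, &r);
	of_node_put(node);
	if (ret)
		return ret;

	if (size > resource_size(&r))
		return -EINVAL;

	/* No cacheable linear-map alias exists for a no-map region */
	vaddr = memremap(r.start, size, MEMREMAP_WC);
	if (!vaddr)
		return -ENOMEM;

	memcpy(vaddr, metadata, size);
	memunmap(vaddr);

	*phys_out = r.start;
	return 0;
}

Since a no-map region is never mapped into the linear map, there should be no
cacheable alias left for speculation to hit once the region is handed over to
the modem.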

Thanks,
Mani

> Will

--
மணிவண்ணன் சதாசிவம்