RE: [RFC PATCH] dma: coherent: respect to device 'dma-coherent' property

From: Z.Q. Hou
Date: Tue Apr 18 2023 - 05:56:48 EST


Hi Robin,

> -----Original Message-----
> From: Robin Murphy <robin.murphy@xxxxxxx>
> Sent: Monday, April 17, 2023 8:28 PM
> To: Z.Q. Hou <zhiqiang.hou@xxxxxxx>; Christoph Hellwig <hch@xxxxxx>
> Cc: iommu@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> m.szyprowski@xxxxxxxxxxx
> Subject: Re: [RFC PATCH] dma: coherent: respect to device 'dma-coherent'
> property
>
> On 2023-04-17 03:06, Z.Q. Hou wrote:
> > Hi Christoph,
> >
> >> -----Original Message-----
> >> From: Christoph Hellwig <hch@xxxxxx>
> >> Sent: Sunday, April 16, 2023 2:30 PM
> >> To: Z.Q. Hou <zhiqiang.hou@xxxxxxx>
> >> Cc: iommu@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; hch@xxxxxx;
> >> m.szyprowski@xxxxxxxxxxx; robin.murphy@xxxxxxx
> >> Subject: Re: [RFC PATCH] dma: coherent: respect to device 'dma-coherent'
> >> property
> >>
> >> On Fri, Apr 14, 2023 at 04:03:07PM +0800, Zhiqiang Hou wrote:
> >>> From: Hou Zhiqiang <Zhiqiang.Hou@xxxxxxx>
> >>>
> >>> Currently, the coherent DMA memory is always mapped as writecombine
> >>> and uncached, ignored the 'dma-coherent' property in device node,
> >>> this patch is to map the memory as writeback and cached when the
> >>> device has 'dma-coherent' property.
> >>
> >> What is the use case here? The somewhat misnamed per-device coherent
> >> memory is intended for small per-device pools of sram or such used
> >> for staging memory.
> >
> > In my case, there are multiple Cortex-A cores within the cluster, in
> > which it is cache coherent, they are split into 2 island for running
> > Linux and RTOS respectively.
> > I created a virtual device for Linux and RTOS communication using
> > shared memory.
> > In Linux side, I created a per-device dma memory pool and added
> > 'dma-coherent' for the virtual device, but the data in shared memory
> > can't be sync up, finally found the per-device dma pool is always
> > mapped as uncached, so submitted this fix patch.
>
> Yes, in principle this should apply similarly to restricted DMA or confidential
> compute VMs where DMA buffers are to be allocated from a predetermined
> shared memory area, and a DT reserved-memory region is used as a coherent
> pool to achieve that. Quite likely that so far this has only been done with
> non-coherent hardware or in software models where a mismatch in nominal
> cacheability wasn't noticeable.
>
> It's a bit niche, but not entirely unreasonable.
>

Understood. This change does not affect devices without 'dma-coherent',
and for devices that have it, it can improve performance by making use of
the hardware cache coherency.
CMA already maps the memory as cacheable when the device node has
'dma-coherent' and as non-cacheable otherwise, so this change aligns the
behavior of the per-device DMA pool with CMA.
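
To make the intended behavior concrete, here is a minimal userspace sketch
of the decision, not the patch itself. The names pool_map_flags,
model_dev and pool_map_flags_for() are hypothetical and exist only for
illustration; the flag names merely echo the kernel's MEMREMAP_WB and
MEMREMAP_WC, and the numeric values are illustrative.

```c
#include <stdbool.h>

/*
 * Stand-ins for the mapping attributes discussed above. These are NOT
 * the kernel definitions; they only model the two choices.
 */
enum pool_map_flags {
	POOL_MAP_WB = 1 << 0,	/* write-back, cached (cf. MEMREMAP_WB) */
	POOL_MAP_WC = 1 << 2,	/* write-combine, uncached (cf. MEMREMAP_WC) */
};

/* Hypothetical device model: only the 'dma-coherent' property matters. */
struct model_dev {
	bool dma_coherent;	/* true if the DT node has 'dma-coherent' */
};

/*
 * Pick the mapping attribute for the per-device pool: cached when the
 * device is declared coherent, write-combine (the current behavior)
 * otherwise.
 */
static inline enum pool_map_flags
pool_map_flags_for(const struct model_dev *dev)
{
	return dev->dma_coherent ? POOL_MAP_WB : POOL_MAP_WC;
}
```

A device without 'dma-coherent' still gets the write-combine mapping, so
existing users of the per-device pool would see no change.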

Thanks,
Zhiqiang