Re: [PATCH 4/9] dmaengine: dw-edma: HDMA: Add memory barrier before starting the DMA transfer in remote setup

From: Serge Semin
Date: Mon Jun 19 2023 - 13:02:34 EST


On Fri, Jun 09, 2023 at 10:16:49AM +0200, Köry Maincent wrote:
> From: Kory Maincent <kory.maincent@xxxxxxxxxxx>
>

> The Linked list element and pointer are not stored in the same memory as
> the HDMA controller register. If the doorbell register is toggled before
> the full write of the linked list a race condition error can appears.
> In remote setup we can only use a readl to the memory to assured the full
> write has occurred.
>
> Fixes: e74c39573d35 ("dmaengine: dw-edma: Add support for native HDMA")
> Signed-off-by: Kory Maincent <kory.maincent@xxxxxxxxxxx>

Is this a hypothetical bug? Have you actually experienced the
described problem? If so are you sure that it's supposed to be fixed
as you suggest?

I am asking because based on the kernel doc (Documentation/memory-barriers.txt):

* 1. All readX() and writeX() accesses to the same peripheral are ordered
* with respect to each other. This ensures that MMIO register accesses
* by the same CPU thread to a particular device will arrive in program
* order.
* ...
* The ordering properties of __iomem pointers obtained with non-default
* attributes (e.g. those returned by ioremap_wc()) are specific to the
* underlying architecture and therefore the guarantees listed above cannot
* generally be relied upon for accesses to these types of mappings.

the IOs performed by the accessors are supposed to arrive in the
program order. Thus SET_CH_32(..., HDMA_V0_DOORBELL_START) performed
after all the previous SET_CH_32(...) are finished looks correct with
no need in additional barriers. The results of the later operations
are supposed to be seen by the device (in our case it's a remote DW
eDMA controller) before the doorbell update from scratch. From that
perspective your problem looks as if the IO operations preceding the
doorbell CSR update aren't finished yet. So you are sure that the LL
memory is mapped with no additional flags like Write-Combine or some
caching optimizations? Are you sure that the PCIe IOs are correctly
implemented in your platform?

I do understand that the eDMA CSRs and the LL memory are mapped by
different BARs in the remote eDMA setup. But they still belong to the
same device. So the IO accessors semantic described in the kernel doc
implies no need in additional barrier.

-Serge(y)

> ---
>
> This patch is fixing a commit which is only in dmaengine tree and not
> merged mainline.
> ---
> drivers/dma/dw-edma/dw-hdma-v0-core.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/drivers/dma/dw-edma/dw-hdma-v0-core.c b/drivers/dma/dw-edma/dw-hdma-v0-core.c
> index 7bd1a0f742be..0b77ddbe91b5 100644
> --- a/drivers/dma/dw-edma/dw-hdma-v0-core.c
> +++ b/drivers/dma/dw-edma/dw-hdma-v0-core.c
> @@ -247,6 +247,15 @@ static void dw_hdma_v0_core_start(struct dw_edma_chunk *chunk, bool first)
> /* Set consumer cycle */
> SET_CH_32(dw, chan->dir, chan->id, cycle_sync,
> HDMA_V0_CONSUMER_CYCLE_STAT | HDMA_V0_CONSUMER_CYCLE_BIT);
> +
> + if (!(chunk->chan->dw->chip->flags & DW_EDMA_CHIP_LOCAL))
> + /* Make sure Linked List has been written.
> + * Linux memory barriers don't cater for what's required here.
> + * What's required is what's here - a read of the linked
> + * list region.
> + */
> + readl(chunk->ll_region.vaddr.io);
> +
> /* Doorbell */
> SET_CH_32(dw, chan->dir, chan->id, doorbell, HDMA_V0_DOORBELL_START);
> }
> --
> 2.25.1
>