Re: [PATCH 4/9] dmaengine: dw-edma: HDMA: Add memory barrier before starting the DMA transfer in remote setup

From: Köry Maincent
Date: Thu Jun 22 2023 - 11:12:32 EST


On Wed, 21 Jun 2023 18:56:49 +0300
Serge Semin <fancer.lancer@xxxxxxxxx> wrote:

> > Thanks for you detailed answer, this was instructive.
> > I will come back with more information if TLP flags are set.
> > FYI the PCIe board I am currently working with is the one from Brainchip:
> > Here is the driver:
> > https://github.com/Brainchip-Inc/akida_dw_edma
>
> I've glanced at the driver a bit:
>
> 1. Nothing criminal I've noticed in the way the BARs are mapped. It's
> done as it's normally done. pcim_iomap_regions() is supposed to map
> with no additional optimization. So the caching seems irrelevant
> in this case.
>
> 2. The probe() method performs some device iATU config:
> akida_1500_setup_iatu() and akida_1000_setup_iatu(). I would have a
> closer look at the way the inbound MWs setup is done.
>
> 3. akida_1000_iatu_conf_table contains comments about the APB bus. If
> it's an internal device bus and both LPDDR and eDMA are accessible
> over the same bus, then the re-ordering may happen there. If APB means
> the well known Advanced Peripheral Bus, then it's a quite slow bus
> with respect to the system interconnect and PCIe buses. If eDMA regs
> and LL-memory buses are different then the last write to the LL-memory
> might be indeed still pending while the doorbells update arrives.
> Sending a dummy read to the LL-memory stalls the program execution
> until a response arrive (PCIe MRd TLPs are non-posted - "send and wait
> for response") which happens only after the last write to the
> LL-memory finishes. That's probably why your fix with the dummy-read
> works and why the delay you noticed is quite significant (4us).
> Though it looks quite strange to put LPDDR on such slow bus.
>
> 4. I would have also had a closer look at the way the outbound MW is
> configured in your PCIe host controller (whether it enables some
> optimizations like Relaxed ordering and ID-based ordering).
>
> In anyway I would have got in touch with the FPGA designers whether
> any of my suppositions correct (especially regarding 3.).

Alright, thanks for your instructive review!

In the HDMA driver point of view we can not know if the eDMA regs and the
LL-memory will be in same bus in whatever future implementation. Of course it
is the hardware designers who should be careful about having a fast bus and
memory for the LL, but wouldn't it be more cautious to have this read?
Just a small thought!

Köry