Re: [PATCH 6/7] k3dma: Fix occasional DMA ERR issue by using proper dma api

From: zhangfei
Date: Thu Jul 21 2016 - 12:08:50 EST




On 07/21/2016 01:22 PM, John Stultz wrote:
On Wed, Jul 20, 2016 at 9:26 PM, zhangfei <zhangfei.gao@xxxxxxxxxx> wrote:


On 07/21/2016 11:53 AM, John Stultz wrote:

After lots of debugging on an occasional DMA ERR issue, I realized
that the desc structures which we point the dma hardware at are being
allocated out of regular memory. This means when we fill the desc
structures, that data doesn't always get flushed out to memory by
the time we start the dma transfer, so the dma engine reads some
null values and raises a DMA ERR on the first irq.


How about using a wmb() flush before starting dma to sync the desc?

So I'm not going to pretend to be an expert here, but my understanding
is that wmb() synchronizes cpu write ordering operations across cpus,
so the cpus see all the changes before the wmb() before they see any
changes after. But I'm not sure what effect wmb() has across cpu
cache to device ordering. I don't think it works as a cache flush to
memory.

Andy's patch introducing the cyclic support actually had a wmb() in it
that I removed as I couldn't understand clearly why it was there (and
there wasn't a comment explaining, as required by checkpatch :). But
even with that wmb(), the DMA ERR was still seen.

Only with these two new changes have I gotten to the point where I
can't seem to trigger the DMA error.


Yes, you are right.
I have double-checked: we have to use non-cached memory for the dma descriptors here, instead of cached memory from kzalloc.

A barrier (wmb or writel) is then used to ensure the descriptors are written before dma is started.
Since we start dma much later, in issue_pending -> tasklet, the chance of the descriptors not being written by then is low.
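
To illustrate only the ordering half of this (descriptor writes must land before the start trigger), here is a rough userspace C11 sketch. It is not driver code: the demo_desc layout, demo_doorbell, demo_start() and demo_device_poll() are hypothetical names made up for the example, and the release fence merely stands in for the role wmb() or writel() would play in the kernel. It says nothing about cache flushing, which is why the descriptors themselves still need to come from non-cached (coherent) memory rather than kzalloc.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical descriptor layout, for illustration only (not the k3dma one). */
struct demo_desc {
	uint32_t lli;   /* link to the next descriptor, 0 = end of chain */
	uint32_t count; /* number of bytes to transfer */
	uint32_t saddr; /* source address */
	uint32_t daddr; /* destination address */
};

/* Stand-in for the controller's "start" register (assumption). */
static _Atomic uint32_t demo_doorbell;

/*
 * Fill the descriptor, then publish it.
 * The release fence plays the role wmb() would play in a driver: all
 * descriptor stores are ordered before the doorbell store.
 * It does NOT model flushing cpu caches to memory.
 */
static void demo_start(struct demo_desc *d, uint32_t src, uint32_t dst,
		       uint32_t len)
{
	assert(len != 0);

	d->lli = 0;
	d->count = len;
	d->saddr = src;
	d->daddr = dst;

	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&demo_doorbell, 1, memory_order_relaxed);
}

/*
 * "Device" side: only read the descriptor once the doorbell is seen.
 * Returns the byte count, or 0 if the transfer has not been started.
 */
static uint32_t demo_device_poll(const struct demo_desc *d)
{
	if (atomic_load_explicit(&demo_doorbell, memory_order_relaxed) == 0)
		return 0;

	atomic_thread_fence(memory_order_acquire);
	return d->count;
}
```

In the driver the same shape would be: take the descriptor from coherent memory, fill it, then let the barrier (or the writel() that kicks the channel) order those writes ahead of the start.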

Thanks