Re: using DMA-API on ARM

From: Catalin Marinas
Date: Fri Dec 05 2014 - 13:29:14 EST


On Fri, Dec 05, 2014 at 03:06:48PM +0000, Russell King - ARM Linux wrote:
> I've been doing more digging into the current DMA code, and I'm dismayed
> to see that there's new bugs in it...
>
> commit 513510ddba9650fc7da456eefeb0ead7632324f6
> Author: Laura Abbott <lauraa@xxxxxxxxxxxxxx>
> Date: Thu Oct 9 15:26:40 2014 -0700
>
> common: dma-mapping: introduce common remapping functions
>
> This uses map_vm_area() to achieve the remapping of pages allocated inside
> dma_alloc_coherent(). dma_alloc_coherent() is documented in a rather
> round-about way in Documentation/DMA-API.txt:
>
> | Part Ia - Using large DMA-coherent buffers
> | ------------------------------------------
> |
> | void *
> | dma_alloc_coherent(struct device *dev, size_t size,
> | dma_addr_t *dma_handle, gfp_t flag)
> |
> | void
> | dma_free_coherent(struct device *dev, size_t size, void *cpu_addr,
> | dma_addr_t dma_handle)
> |
> | Free a region of consistent memory you previously allocated. dev,
> | size and dma_handle must all be the same as those passed into
> | dma_alloc_coherent(). cpu_addr must be the virtual address returned by
> | the dma_alloc_coherent().
> |
> | Note that unlike their sibling allocation calls, these routines
> | may only be called with IRQs enabled.
>
> Note that very last paragraph. What this says is that it is explicitly
> permitted to call dma_alloc_coherent() with IRQs disabled.

This is solved by using a pre-allocated, pre-mapped atomic_pool, which
avoids any further mapping at allocation time. __dma_alloc() calls
__alloc_from_pool() when the gfp flags lack __GFP_WAIT, i.e. when the
caller cannot sleep.
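
To illustrate the idea (not the actual kernel code), here is a rough
user-space model of such a pool. Every name in it (struct atomic_pool,
pool_alloc(), model_dma_alloc(), the 64-page size) is made up for the
example. The only point is that the atomic path hands out pages from a
region that was set up earlier, so it never has to create a mapping:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout: a user-space model, not kernel code. */
#define POOL_PAGE_SIZE 4096u
#define POOL_PAGES     64u      /* arbitrary; the right size is hard to say */

struct atomic_pool {
	uint8_t *base;              /* stands in for the pre-mapped region */
	bool used[POOL_PAGES];      /* one in-use flag per page */
};

/* Done once at init time, where sleeping and remapping are allowed. */
static int pool_init(struct atomic_pool *p)
{
	p->base = calloc(POOL_PAGES, POOL_PAGE_SIZE);
	if (!p->base)
		return -1;
	for (size_t i = 0; i < POOL_PAGES; i++)
		p->used[i] = false;
	return 0;
}

/* First-fit search for a run of free pages; no mapping work is done here. */
static void *pool_alloc(struct atomic_pool *p, size_t size)
{
	size_t need = (size + POOL_PAGE_SIZE - 1) / POOL_PAGE_SIZE;

	if (need == 0 || need > POOL_PAGES)
		return NULL;
	for (size_t start = 0; start + need <= POOL_PAGES; start++) {
		size_t run = 0;

		while (run < need && !p->used[start + run])
			run++;
		if (run == need) {
			for (size_t i = 0; i < need; i++)
				p->used[start + i] = true;
			return p->base + start * POOL_PAGE_SIZE;
		}
		start += run;           /* loop increment then skips the used page */
	}
	return NULL;
}

/* Give pages back; addr and size must match the earlier pool_alloc(). */
static void pool_free(struct atomic_pool *p, void *addr, size_t size)
{
	size_t off = (size_t)((uint8_t *)addr - p->base);
	size_t first = off / POOL_PAGE_SIZE;
	size_t need = (size + POOL_PAGE_SIZE - 1) / POOL_PAGE_SIZE;

	assert(off % POOL_PAGE_SIZE == 0 && first + need <= POOL_PAGES);
	for (size_t i = 0; i < need; i++)
		p->used[first + i] = false;
}

/* Model of the dispatch: callers that cannot sleep only ever use the pool. */
static void *model_dma_alloc(struct atomic_pool *p, size_t size, bool can_sleep)
{
	if (!can_sleep)
		return pool_alloc(p, size);
	return calloc(1, size);     /* stands in for the allocate-and-remap path */
}
```

Note that an atomic request simply fails once the pool is exhausted,
which is why sizing the pool matters.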

This code has become pretty complex and we may well find bugs in it. It
could be simplified by using a pre-allocated non-cacheable region that is
safe to allocate from in atomic context (though how big to make it is
hard to say).

> If the problem which you (Broadcom) are suffering from is down to the
> issue I suspect (that being having mappings with different cache
> attributes) then I'm not sure that there's anything we can realistically
> do about that. There's a number of issues which make it hard to see a
> way forward.

I'm still puzzled by this problem, so I don't have any suggestions yet. I
wouldn't blame the mismatched attributes yet as I haven't seen such a
problem in practice (but you never know).

How does the DT describe this device? Could it have a dma-coherent
property in there that causes dma_alloc_coherent() to create a cacheable
mapping?

The reverse could also cause problems: the device is coherent but the
CPU creates a non-cacheable mapping.
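
Put as a toy model, the two cases look like this. The names below are
invented for the example, and it assumes the allocator picks a
cacheable mapping exactly when the DT claims the device is coherent.
That is the hypothesis being asked about, not something confirmed:

```c
#include <stdbool.h>

/* Hypothetical model; none of these names exist in the kernel. */
enum cpu_mapping { MAP_CACHEABLE, MAP_NONCACHEABLE };

/* Assumed policy: the mapping type follows what the DT claims. */
static enum cpu_mapping pick_mapping(bool dt_says_coherent)
{
	return dt_says_coherent ? MAP_CACHEABLE : MAP_NONCACHEABLE;
}

/* True when the chosen CPU mapping agrees with how the hardware behaves. */
static bool mapping_matches_hw(bool dt_says_coherent, bool hw_is_coherent)
{
	enum cpu_mapping m = pick_mapping(dt_says_coherent);

	return hw_is_coherent ? (m == MAP_CACHEABLE) : (m == MAP_NONCACHEABLE);
}
```

Only the two cases where the DT and the hardware disagree come out as
a mismatch.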

--
Catalin