Re: [PATCH 1/2] drm: add cache support for arm64

From: Rob Clark
Date: Tue Aug 06 2019 - 10:11:55 EST


On Tue, Aug 6, 2019 at 1:48 AM Christoph Hellwig <hch@xxxxxx> wrote:
>
> This goes in the wrong direction. drm_cflush_* are a bad API we need to
> get rid of, not add use of it. The reason for that is two-fold:
>
> a) it doesn't address how cache maintenance actually works on most
> platforms. When talking about a cache we have three fundamental
> operations:
>
> 1) write back - this writes the content of the cache back to the
> backing memory
> 2) invalidate - this removes the content of the cache
> 3) write back + invalidate - do both of the above

Agreed that drm_cflush_* isn't a great API. In this particular case
(IIUC), I need wb+inv so that there aren't dirty cache lines that drop
out to memory later, and so that I don't get a cache hit on
uncached/wc mmap'ing.
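
To make the two hazards concrete, here is a toy userspace model (not
kernel code) of a single cache line in front of one word of memory. All
names are made up for illustration, and wc_read() hitting a valid line
is an assumption of the model, not a statement about any particular
CPU:

```c
#include <stdbool.h>

/* Toy model: one cache line in front of one word of backing memory. */
struct line {
	int  data;
	bool valid;
	bool dirty;
};

static int memory;          /* backing memory, what the device sees */
static struct line cache;   /* the single cache line */

/* CPU store through a cacheable mapping: lands in the cache only. */
static void cached_write(int v)
{
	cache.data = v;
	cache.valid = true;
	cache.dirty = true;
}

/* 1) write back: push dirty content to memory, keep the line. */
static void writeback(void)
{
	if (cache.valid && cache.dirty) {
		memory = cache.data;
		cache.dirty = false;
	}
}

/* 2) invalidate: drop the line without writing it anywhere. */
static void invalidate(void)
{
	cache.valid = false;
	cache.dirty = false;
}

/* 3) write back + invalidate: do both of the above, in that order. */
static void writeback_invalidate(void)
{
	writeback();
	invalidate();
}

/* The line falling out of the cache at some arbitrary later time. */
static void evict(void)
{
	writeback_invalidate();
}

/* ASSUMPTION of this toy: an access via the wc mapping may hit a valid line. */
static int wc_read(void)
{
	return cache.valid ? cache.data : memory;
}

/* Hazard 1: a dirty line left behind drops out to memory later. */
static int memory_after_late_eviction(bool wb_inv_first)
{
	memory = 0;
	invalidate();
	cached_write(1);            /* CPU touched the page via a cached mapping */
	if (wb_inv_first)
		writeback_invalidate();
	memory = 2;                 /* device (or wc mapping) writes new data */
	evict();                    /* old line falls out of the cache */
	return memory;
}

/* Hazard 2: write back alone leaves a valid line that can still be hit. */
static int wc_read_after_device_write(bool invalidate_too)
{
	memory = 0;
	invalidate();
	cached_write(1);
	if (invalidate_too)
		writeback_invalidate();
	else
		writeback();
	memory = 2;                 /* device writes new data */
	return wc_read();
}
```

In this model, skipping the maintenance lets the late eviction clobber
the device's data, and a write back without the invalidate leaves a
stale line to hit, which is why I think wb+inv is what I want here.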

> b) which of the above operations you use, and when, depends on a
> couple of factors of what you want to do with the range you do the
> cache maintenance operations on
>
> Take a look at the comment in arch/arc/mm/dma.c around line 30 that
> explains how this applies to buffer ownership management. Note that
> "for device" applies to "for userspace" in the same way, just that
> userspace then also needs to follow this protocol. So the whole idea
> that random driver code calls random low-level cache maintenance
> operations (and uses the non-specific term flush to make it all more
> confusing) is a bad idea. Fortunately enough we have really good
> arch helpers for all non-coherent architectures (this excludes the
> magic i915 does, which won't be covered by that, but that is a
> separate issue to be addressed later, and the fact that arm32 only
> grew them very recently and doesn't expose them for all configs,
> which is easily fixable if needed) with arch_sync_dma_for_device and
> arch_sync_dma_for_cpu. So what we need is to figure out where we
> have valid cases for buffer ownership transfer outside the DMA
> API, and build proper wrappers around the above functions for that.
> My guess is it should probably be built to go with the iommu API
> as that is the only other way to map memory for DMA access, but
> if you have a better idea I'd be open to discussion.
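
For reference, my understanding of the ownership protocol described
above, written out as a small standalone C sketch. The enum and
function names are hypothetical (they are not the kernel's), and the
table reflects my reading of that comment, so treat it as an assumption
and check it against the source:

```c
/* Hypothetical names for illustration only; not the kernel's enums. */
enum xfer_dir {
	XFER_TO_DEVICE,       /* CPU produced the data, device consumes it */
	XFER_FROM_DEVICE,     /* device produces the data, CPU consumes it */
	XFER_BIDIRECTIONAL,   /* both sides read and write the buffer */
};

enum cache_op {
	OP_NONE,
	OP_WRITEBACK,
	OP_INVALIDATE,
	OP_WRITEBACK_INVALIDATE,
};

/* Ownership goes CPU -> device (or userspace): what to do first. */
static enum cache_op op_for_device(enum xfer_dir dir)
{
	switch (dir) {
	case XFER_TO_DEVICE:
		return OP_WRITEBACK;              /* make CPU writes visible */
	case XFER_FROM_DEVICE:
		return OP_INVALIDATE;             /* no dirty lines may land later */
	case XFER_BIDIRECTIONAL:
		return OP_WRITEBACK_INVALIDATE;
	}
	return OP_NONE;
}

/* Ownership comes back device -> CPU: what to do before reading. */
static enum cache_op op_for_cpu(enum xfer_dir dir)
{
	switch (dir) {
	case XFER_TO_DEVICE:
		return OP_NONE;                   /* device did not write anything */
	case XFER_FROM_DEVICE:
	case XFER_BIDIRECTIONAL:
		return OP_INVALIDATE;             /* drop stale or speculated lines */
	}
	return OP_NONE;
}
```

The point being that the caller states a direction and an ownership
transfer, and the helper picks the operation, rather than the driver
picking a "flush" by hand.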

Tying it in w/ iommu seems a bit weird to me, but maybe that is just
me. I'm certainly willing to consider proposals or to try things and
see how they work out.

Exposing the arch_sync_* API and using that directly (bypassing
drm_cflush_*) actually seems pretty reasonable and pragmatic. I did
have one doubt: phys_to_virt() is only valid for kernel direct-mapped
memory (AFAIU), so what happens for pages that are not in the kernel
linear map? Maybe it is ok to ignore those pages, since they won't
have an aliased mapping?

BR,
-R