Re: [PATCH v4 10/15] drm/shmem-helper: Take reservation lock instead of drm_gem_shmem locks

From: Christian König
Date: Wed May 11 2022 - 10:24:46 EST


On 11.05.22 15:00, Daniel Vetter wrote:
On Tue, May 10, 2022 at 04:39:53PM +0300, Dmitry Osipenko wrote:
[SNIP]
Since vmapping implies implicit pinning, we can't use a separate lock in
drm_gem_shmem_vmap(): we need to protect drm_gem_shmem_get_pages(),
which is invoked by drm_gem_shmem_vmap() to pin the pages and requires
the dma_resv_lock to be held.
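To make that concrete, roughly (a simplified sketch of the shmem vmap
path under this series, not the actual helper code; the usual
<drm/drm_gem_shmem_helper.h>, <linux/dma-resv.h> and <linux/vmalloc.h>
context is assumed):

/*
 * Sketch of the chain described above, once drm_gem_shmem_get_pages()
 * is protected by the reservation lock: the lock has to be taken
 * somewhere on the vmap path.
 */
static int shmem_vmap_sketch(struct drm_gem_shmem_object *shmem,
			     struct iosys_map *map)
{
	void *vaddr;
	int ret;

	dma_resv_lock(shmem->base.resv, NULL);	/* the problematic lock */

	ret = drm_gem_shmem_get_pages(shmem);	/* pins the backing pages */
	if (ret)
		goto unlock;

	vaddr = vmap(shmem->pages, shmem->base.size >> PAGE_SHIFT,
		     VM_MAP, PAGE_KERNEL);
	if (!vaddr)
		ret = -ENOMEM;
	else
		iosys_map_set_vaddr(map, vaddr);
unlock:
	dma_resv_unlock(shmem->base.resv);
	return ret;
}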

Hence the problem is:

1. If a dma-buf importer holds the dma_resv_lock and invokes
dma_buf_vmap() -> drm_gem_shmem_vmap(), then drm_gem_shmem_vmap() must
not take the dma_resv_lock itself.

2. Since the dma-buf locking convention isn't specified, we can't assume
that a dma-buf importer holds the dma_resv_lock around dma_buf_vmap().
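For illustration, case 1 as hypothetical importer code (just a sketch,
the function name is made up):

/*
 * The importer legitimately holds the reservation lock around
 * dma_buf_vmap(); the exporter's vmap implementation therefore must
 * not take it again, or this deadlocks.
 */
static int importer_vmap_sketch(struct dma_buf *dmabuf,
				struct iosys_map *map)
{
	int ret;

	dma_resv_lock(dmabuf->resv, NULL);
	ret = dma_buf_vmap(dmabuf, map); /* may end up in drm_gem_shmem_vmap() */
	dma_resv_unlock(dmabuf->resv);

	return ret;
}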

The possible solutions are:

1. Specify the dma_resv_lock convention for dma-bufs and make all
drivers follow it.

2. Make only DRM drivers hold the dma_resv_lock around dma_buf_vmap();
non-DRM drivers will then get lockdep warnings.

3. Make drm_gem_shmem_vmap() take the dma_resv_lock and accept the
deadlock whenever a dma-buf importer already holds the lock.

...
Yeah this is all very annoying.

Ah, yes that topic again :)

I think we could relatively easily fix that by defining and enforcing that the dma_resv_lock must be held by the caller when dma_buf_vmap() is called.

A two step approach should work:
1. Move the call to dma_resv_lock() into the dma_buf_vmap() function and remove all lock taking from the vmap callback implementations.
2. Move the call to dma_resv_lock() into the callers of dma_buf_vmap() and enforce that the function is called with the lock held.
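Roughly, the two steps could look like this (only a sketch, not actual
patches; the vmap-count caching and error checks of the real
dma_buf_vmap() are omitted):

/*
 * Step 1, interim state: dma_buf_vmap() takes the lock itself, while
 * the exporters' vmap callbacks are stripped of their own locking.
 */
int dma_buf_vmap(struct dma_buf *dmabuf, struct iosys_map *map)
{
	int ret;

	dma_resv_lock(dmabuf->resv, NULL);
	ret = dmabuf->ops->vmap(dmabuf, map);
	dma_resv_unlock(dmabuf->resv);

	return ret;
}

/*
 * Step 2, final state: the callers lock, and dma_buf_vmap() only
 * enforces the new convention.
 */
int dma_buf_vmap(struct dma_buf *dmabuf, struct iosys_map *map)
{
	dma_resv_assert_held(dmabuf->resv);

	return dmabuf->ops->vmap(dmabuf, map);
}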

It shouldn't be that hard to clean up. The last time I looked into it, my main problem was that we didn't have any easy unit test for it.

Regards,
Christian.


There are actually very few drivers in the kernel that use dma_buf_vmap()
[1], so perhaps it's not really a big deal to first try to define the
locking and pinning convention for dma-bufs, at least for
dma_buf_vmap()? Let me try to do this.

[1] https://elixir.bootlin.com/linux/v5.18-rc6/C/ident/dma_buf_vmap
Yeah, looking through the code there are largely two classes of drivers
that need vmap:

- display drivers that need to do cpu uploads (usb, spi, i2c displays).
  Those generally set up the vmap at import time or when creating the
  drm_framebuffer object (e.g. see
  drm_gem_cma_prime_import_sg_table_vmap(); a sketch follows after this
  list), because that's really the only place where you can safely do
  that without running into locking inversion issues sooner or later

- lots of other drivers (and the shmem helpers) seem to do dma_buf_vmap
  just because they can, but only ever actually use vmap on native
  objects, never on imported ones. Or at least I think so.
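For reference, the import-time vmap pattern of the first class, modeled
loosely on drm_gem_cma_prime_import_sg_table_vmap() (simplified sketch,
error handling trimmed):

struct drm_gem_object *
import_sg_table_vmap_sketch(struct drm_device *dev,
			    struct dma_buf_attachment *attach,
			    struct sg_table *sgt)
{
	struct drm_gem_cma_object *cma_obj;
	struct drm_gem_object *obj;
	struct iosys_map map;
	int ret;

	/* vmap once at import time, before any driver locks are held */
	ret = dma_buf_vmap(attach->dmabuf, &map);
	if (ret)
		return ERR_PTR(ret);

	obj = drm_gem_cma_prime_import_sg_table(dev, attach, sgt);
	if (IS_ERR(obj)) {
		dma_buf_vunmap(attach->dmabuf, &map);
		return obj;
	}

	cma_obj = to_drm_gem_cma_obj(obj);
	cma_obj->vaddr = map.vaddr;	/* reused for cpu uploads later */

	return obj;
}

The point being that dma_buf_vmap() runs here before anyone holds
driver or reservation locks on the new object, so there's no inversion.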

So maybe another approach here:

1. In general, drivers which need a vmap should set that up at dma_buf
import time - the same way we pin the buffers at import time for
non-dynamic importers - because that's the only place where, across all
drivers, it's ok to just take the dma_resv_lock.

2. We remove the "just because we can" dma_buf_vmap support from
helpers/drivers - the paths can all already cope with failure, since
dma_buf_vmap() can fail. vmap will then only work on native objects, not
imported ones (see the sketch after this list).

3. If there is any driver using the shmem helpers that absolutely needs
vmap to also work on imported objects, it needs a special import function
(like the cma helpers have) which sets up the vmap at import time.
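Point 2 could then be as simple as (hypothetical helper, shmem-based
just for concreteness):

/*
 * vmap simply refuses imported objects; callers treat that like any
 * other vmap failure and fall back to their non-vmap path.
 */
static int vmap_native_only_sketch(struct drm_gem_shmem_object *shmem,
				   struct iosys_map *map)
{
	if (shmem->base.import_attach)	/* imported through dma-buf */
		return -EOPNOTSUPP;

	return drm_gem_shmem_vmap(shmem, map);
}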

So since this is all very tricky ... what did I miss this time around?

I envision that the extra dma_resv_locks for dma-bufs may create
unnecessary bottlenecks for drivers that don't actually need the
locking, so driver authors will need to keep this in mind. On the other
hand, I don't think any of today's drivers will notice the additional
resv locks in practice.
Nah, I don't think the extra locking will ever create a bottleneck,
especially not for vmap. Generally vmap is a fallback or at least a cpu
operation, so at that point you're already going very slow.
-Daniel