Re: [PATCH drm-misc-next v4 0/8] [RFC] DRM GPUVA Manager GPU-VM features

From: Boris Brezillon
Date: Thu Sep 28 2023 - 08:09:23 EST


On Wed, 20 Sep 2023 16:42:33 +0200
Danilo Krummrich <dakr@xxxxxxxxxx> wrote:

> So far the DRM GPUVA manager offers common infrastructure to track GPU VA
> allocations and mappings, generically connect GPU VA mappings to their
> backing buffers and perform more complex mapping operations on the GPU VA
> space.
>
> However, there are more design patterns commonly used by drivers, which
> can potentially be generalized in order to make the DRM GPUVA manager
> represent a basic GPU-VM implementation. In this context, this patch series
> aims at generalizing the following elements.
>
> 1) Provide a common dma-resv for GEM objects not being used outside of
> this GPU-VM.
>
> 2) Provide tracking of external GEM objects (GEM objects which are
> shared with other GPU-VMs).
>
> 3) Provide functions to efficiently lock the dma-resv of all GEM objects
> the GPU-VM contains mappings of.
>
> 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
> of, such that validation of evicted GEM objects is accelerated.
>
> 5) Provide some convenience functions for common patterns.
>
> The implementation introduces struct drm_gpuvm_bo, which serves as an
> abstraction combining a struct drm_gpuvm and a struct drm_gem_object, similar
> to what amdgpu does with struct amdgpu_bo_vm. While this adds a bit of
> complexity, it improves the efficiency of tracking external and evicted GEM
> objects.
>
> This patch series also renames struct drm_gpuva_manager to struct drm_gpuvm,
> including the corresponding functions. This way the GPUVA manager's structures align
> better with the documentation of VM_BIND [1] and VM_BIND locking [2]. It also
> provides a better foundation for the naming of data structures and functions
> introduced for implementing the features of this patch series.
>
> This patch series is also available at [3].
>
> [1] Documentation/gpu/drm-vm-bind-async.rst
> [2] Documentation/gpu/drm-vm-bind-locking.rst
> [3] https://gitlab.freedesktop.org/nouvelles/kernel/-/commits/gpuvm-next
>
> Changes in V2:
> ==============
> - rename 'drm_gpuva_manager' -> 'drm_gpuvm' which generally leads to more
> consistent naming
> - properly separate commits (introduce common dma-resv, drm_gpuvm_bo
> abstraction, etc.)
> - remove maple tree for tracking external objects, use a list of drm_gpuvm_bos
> per drm_gpuvm instead
> - rework dma-resv locking helpers (Thomas)
> - add a locking helper for a given range of the VA space (Christian)
> - make the GPUVA manager buildable as a module, rather than making drm_exec
> built-in (Christian)
>
> Changes in V3:
> ==============
> - rename missing function and files (Boris)
> - warn if vm_obj->obj != obj in drm_gpuva_link() (Boris)
> - don't expose drm_gpuvm_bo_destroy() (Boris)
> - unlink VM_BO from GEM in drm_gpuvm_bo_destroy() rather than
> drm_gpuva_unlink() and link within drm_gpuvm_bo_obtain() to keep
> drm_gpuvm_bo instances unique
> - add internal locking to external and evicted object lists to support drivers
> updating the VA space from within the fence signalling critical path (Boris)
> - unlink external objects and evicted objects from the GPUVM's list in
> drm_gpuvm_bo_destroy()
> - add more documentation and fix some kernel-doc issues
>
> Changes in V4:
> ==============
> - add a drm_gpuvm_resv() helper (Boris)
> - add a drm_gpuvm::<list_name>::local_list field (Boris)
> - remove drm_gpuvm_bo_get_unless_zero() helper (Boris)
> - fix missing NULL assignment in get_next_vm_bo_from_list() (Boris)
> - keep a drm_gem_object reference on potential vm_bo destroy (alternatively we
> could free the vm_bo and drop the vm_bo's drm_gem_object reference through
> async work)
> - introduce DRM_GPUVM_RESV_PROTECTED flag to indicate external locking through
> the corresponding dma-resv locks to optimize for drivers already holding
> them when needed; add the corresponding lock_assert_held() calls (Thomas)
> - make drm_gpuvm_bo_evict() per vm_bo and add a drm_gpuvm_bo_gem_evict()
> helper (Thomas)
> - pass a drm_gpuvm_bo in drm_gpuvm_ops::vm_bo_validate() (Thomas)
> - documentation fixes
>
> Danilo Krummrich (8):
> drm/gpuvm: rename struct drm_gpuva_manager to struct drm_gpuvm
> drm/gpuvm: allow building as module
> drm/nouveau: uvmm: rename 'umgr' to 'base'
> drm/gpuvm: add common dma-resv per struct drm_gpuvm
> drm/gpuvm: add an abstraction for a VM / BO combination
> drm/gpuvm: add drm_gpuvm_flags to drm_gpuvm
> drm/gpuvm: generalize dma_resv/extobj handling and GEM validation

Tested-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx>

> drm/nouveau: GPUVM dma-resv/extobj handling, GEM validation
>
>  drivers/gpu/drm/Kconfig                   |    7 +
>  drivers/gpu/drm/Makefile                  |    2 +-
>  drivers/gpu/drm/drm_debugfs.c             |   16 +-
>  drivers/gpu/drm/drm_gpuva_mgr.c           | 1725 --------------
>  drivers/gpu/drm/drm_gpuvm.c               | 2600 +++++++++++++++++++++
>  drivers/gpu/drm/nouveau/Kconfig           |    1 +
>  drivers/gpu/drm/nouveau/nouveau_bo.c      |    4 +-
>  drivers/gpu/drm/nouveau/nouveau_debugfs.c |    2 +-
>  drivers/gpu/drm/nouveau/nouveau_exec.c    |   52 +-
>  drivers/gpu/drm/nouveau/nouveau_exec.h    |    4 -
>  drivers/gpu/drm/nouveau/nouveau_gem.c     |    5 +-
>  drivers/gpu/drm/nouveau/nouveau_sched.h   |    4 +-
>  drivers/gpu/drm/nouveau/nouveau_uvmm.c    |  207 +-
>  drivers/gpu/drm/nouveau/nouveau_uvmm.h    |    8 +-
>  include/drm/drm_debugfs.h                 |    6 +-
>  include/drm/drm_gem.h                     |   32 +-
>  include/drm/drm_gpuva_mgr.h               |  706 ------
>  include/drm/drm_gpuvm.h                   | 1142 +++++++++
> 18 files changed, 3934 insertions(+), 2589 deletions(-)
> delete mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
> create mode 100644 drivers/gpu/drm/drm_gpuvm.c
> delete mode 100644 include/drm/drm_gpuva_mgr.h
> create mode 100644 include/drm/drm_gpuvm.h
>
>
> base-commit: 1c7a387ffef894b1ab3942f0482dac7a6e0a909c