Re: [PATCH drm-next v5 03/14] drm: manager to keep track of GPUs VA mappings

From: Danilo Krummrich
Date: Thu Jun 22 2023 - 11:07:07 EST


On 6/22/23 16:42, Christian König wrote:
Am 22.06.23 um 16:22 schrieb Danilo Krummrich:
On 6/22/23 15:54, Christian König wrote:
Am 20.06.23 um 14:23 schrieb Danilo Krummrich:
Hi Christian,

On 6/20/23 08:45, Christian König wrote:
Hi Danilo,

sorry for the delayed reply. I've trying to dig myself out of a hole at the moment.

No worries, thank you for taking a look anyway!


Am 20.06.23 um 02:42 schrieb Danilo Krummrich:
[SNIP]
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index bbc721870c13..5ec8148a30ee 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -36,6 +36,8 @@
  #include <linux/kref.h>
  #include <linux/dma-resv.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
  #include <drm/drm_vma_manager.h>
@@ -379,6 +381,18 @@ struct drm_gem_object {
       */
      struct dma_resv _resv;
+    /**
+     * @gpuva:
+     *
+     * Provides the list of GPU VAs attached to this GEM object.
+     *
+     * Drivers should lock list accesses with the GEMs &dma_resv lock
+     * (&drm_gem_object.resv).
+     */
+    struct {
+        struct list_head list;
+    } gpuva;
+
      /**
       * @funcs:
       *

I'm pretty sure that it's not a good idea to attach this directly to the GEM object.

Why do you think so? IMHO having a common way to connect mappings to their backing buffers is a good thing, since every driver needs this connection anyway.

E.g. when a BO gets evicted, drivers can just iterate the list of mappings and, as the circumstances require, invalidate the corresponding mappings or to unmap all existing mappings of a given buffer.

What would be the advantage to let every driver implement a driver specific way of keeping this connection?

Flexibility. For example on amdgpu the mappings of a BO are groups by VM address spaces.

E.g. the BO points to multiple bo_vm structures which in turn have lists of their mappings.

Isn't this (almost) the same relationship I introduce with the GPUVA manager?

If you would switch over to the GPUVA manager right now, it would be that every GEM has a list of it's mappings (the gpuva list). The mapping is represented by struct drm_gpuva (of course embedded in driver specific structure(s)) which has a pointer to the VM address space it is part of, namely the GPUVA manager instance. And the GPUVA manager keeps a maple tree of it's mappings as well.

If you still would like to *directly* (indirectly you already have that relationship) keep a list of GPUVA managers (VM address spaces) per GEM, you could still do that in a driver specific way.

Do I miss something?

How do you efficiently find only the mappings of a BO in one VM?

Actually, I think this case should even be more efficient than with a BO having a list of GPUVAs (or mappings):

Having a list of GPUVAs per GEM, each GPUVA has a pointer to it's VM. Hence, you'd only need to iterate the list of mappings for a given BO and check the mappings VM pointer.

Having a list of VMs per BO, you'd have to iterate the whole VM to find the mappings having a pointer to the given BO, right?

I'd think that a single VM potentially has more mapping entries than a single BO was mapped in multiple VMs.

Another case to consider is the case I originally had in mind choosing this relationship: finding all mappings for a given BO, which I guess all drivers need to do in order to invalidate mappings on BO eviction.

Having a list of VMs per BO, wouldn't you need to iterate all of the VMs entirely?


Keep in mind that we have cases where one BO is shared with hundreds of different VMs as well as potentially the number of mappings can be >10k.



Additional to that there is a state maschine associated with the mappings, e.g. if the page tables are up to date or need to be updated etc....

Do you see cases where this kind of connection between mappings and backing buffers wouldn't be good enough? If so, which cases do you have in mind? Maybe we can cover them in a common way as well?

Yeah, we have tons of cases like that. But I have no idea how to generalize them.

They could still remain to be driver specific then, right?

Well does the mapping has a back pointer to the BO? And can that be optional NULL if there is no BO?

Yes to both.

- Danilo


Regards,
Christian.





As you wrote in the commit message it's highly driver specific what to map and where to map it.

In the end the common case should be that in a VA space at least every mapping being backed by a BO is represented by a struct drm_gpuva.

Oh, no! We also have mappings not backed by a BO at all! For example for partial resident textures or data routing to internal hw etc...

You can't be sure that a mapping is backed by a BO at all.

I fully agree, that's why I wrote "the common case should be that in a VA space at least every mapping *being backed by a BO* is represented by a struct drm_gpuva".

Mappings not being backed by an actual BO would not be linked to a GEM of course.




Instead I suggest to have a separate structure for mappings in a VA space which driver can then add to their GEM objects or whatever they want to map into their VMs.

Which kind of separate structure for mappings? Another one analogous to struct drm_gpuva?

Well similar to what amdgpu uses BO -> one structure for each combination of BO and VM -> mappings inside that VM

As explained above, I think that's exactly what the GPUVA manager does, just in another order:

BO has list of mappings, mappings have pointer to VM, VM has list (or actually a maple tree) of mappings.

You see any advantages or disadvantages of either order of relationships? For me it looks like it doesn't really matter which one to pick.

- Danilo


Regards,
Christian.


- Danilo


Regards,
Christian.