Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

From: Oded Gabbay
Date: Mon Feb 06 2023 - 09:48:50 EST


On Thu, Jan 19, 2023 at 7:24 AM Matthew Brost <matthew.brost@xxxxxxxxx> wrote:
>
> On Thu, Jan 19, 2023 at 05:04:32AM +0100, Danilo Krummrich wrote:
> > On 1/18/23 20:48, Christian König wrote:
> > > Am 18.01.23 um 20:17 schrieb Dave Airlie:
> > > > On Thu, 19 Jan 2023 at 02:54, Alex Deucher <alexdeucher@xxxxxxxxx> wrote:
> > > > > On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich
> > > > > <dakr@xxxxxxxxxx> wrote:
> > > > > >
> > > > > >
> > > > > > On 1/18/23 17:30, Alex Deucher wrote:
> > > > > > > On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich
> > > > > > > <dakr@xxxxxxxxxx> wrote:
> > > > > > > > On 1/18/23 16:37, Christian König wrote:
> > > > > > > > > Am 18.01.23 um 16:34 schrieb Danilo Krummrich:
> > > > > > > > > > Hi Christian,
> > > > > > > > > >
> > > > > > > > > > On 1/18/23 09:53, Christian König wrote:
> > > > > > > > > > > Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
> > > > > > > > > > > > This patch series provides a new UAPI for the Nouveau driver in
> > > > > > > > > > > > order to
> > > > > > > > > > > > support Vulkan features, such as
> > > > > > > > > > > > sparse bindings and sparse
> > > > > > > > > > > > residency.
> > > > > > > > > > > >
> > > > > > > > > > > > Furthermore, with the DRM GPUVA
> > > > > > > > > > > > manager it provides a new DRM core
> > > > > > > > > > > > feature to
> > > > > > > > > > > > keep track of GPU virtual address
> > > > > > > > > > > > (VA) mappings in a more generic way.
> > > > > > > > > > > >
> > > > > > > > > > > > The DRM GPUVA manager is indented to help drivers implement
> > > > > > > > > > > > userspace-manageable
> > > > > > > > > > > > GPU VA spaces in reference to the Vulkan API. In order to achieve
> > > > > > > > > > > > this goal it
> > > > > > > > > > > > serves the following purposes in this context.
> > > > > > > > > > > >
> > > > > > > > > > > > 1) Provide a dedicated range allocator to track GPU VA
> > > > > > > > > > > > allocations and
> > > > > > > > > > > > mappings, making use of the drm_mm range allocator.
> > > > > > > > > > > This means that the ranges are allocated
> > > > > > > > > > > by the kernel? If yes that's
> > > > > > > > > > > a really really bad idea.
> > > > > > > > > > No, it's just for keeping track of the
> > > > > > > > > > ranges userspace has allocated.
> > > > > > > > > Ok, that makes more sense.
> > > > > > > > >
> > > > > > > > > So basically you have an IOCTL which asks kernel
> > > > > > > > > for a free range? Or
> > > > > > > > > what exactly is the drm_mm used for here?
> > > > > > > > Not even that, userspace provides both the base
> > > > > > > > address and the range,
> > > > > > > > the kernel really just keeps track of things.
> > > > > > > > Though, writing a UAPI on
> > > > > > > > top of the GPUVA manager asking for a free range instead would be
> > > > > > > > possible by just adding the corresponding wrapper functions to get a
> > > > > > > > free hole.
> > > > > > > >
> > > > > > > > Currently, and that's what I think I read out of
> > > > > > > > your question, the main
> > > > > > > > benefit of using drm_mm over simply stuffing the
> > > > > > > > entries into a list or
> > > > > > > > something boils down to easier collision detection and iterating
> > > > > > > > sub-ranges of the whole VA space.
> > > > > > > Why not just do this in userspace? We have a range manager in
> > > > > > > libdrm_amdgpu that you could lift out into libdrm or some other
> > > > > > > helper.
> > > > > > The kernel still needs to keep track of the mappings within the various
> > > > > > VA spaces, e.g. it silently needs to unmap mappings that are backed by
> > > > > > BOs that get evicted and remap them once they're validated (or swapped
> > > > > > back in).
> > > > > Ok, you are just using this for maintaining the GPU VM space in
> > > > > the kernel.
> > > > >
> > > > Yes the idea behind having common code wrapping drm_mm for this is to
> > > > allow us to make the rules consistent across drivers.
> > > >
> > > > Userspace (generally Vulkan, some compute) has interfaces that pretty
> > > > much dictate a lot of how VMA tracking works, esp around lifetimes,
> > > > sparse mappings and splitting/merging underlying page tables, I'd
> > > > really like this to be more consistent across drivers, because already
> > > > I think we've seen with freedreno some divergence from amdgpu and we
> > > > also have i915/xe to deal with. I'd like to at least have one place
> > > > that we can say this is how it should work, since this is something
> > > > that *should* be consistent across drivers mostly, as it is more about
> > > > how the uapi is exposed.
> > >
> > > That's a really good idea, but the implementation with drm_mm won't work
> > > like that.
> > >
> > > We have Vulkan applications which use the sparse feature to create
> > > literally millions of mappings. That's why I have fine tuned the mapping
>
> Is this not an application issue? Millions of mappings seems a bit
> absurd to me.
If I look at the most extreme case for AI, assuming 256GB of HBM
memory and page mapping of 2MB, we get to 128K of mappings. But that's
really the extreme case imo. I assume most mappings will be much
larger. In fact, in the most realistic scenario of large-scale
training, a single user will probably map the entire HBM memory using
1GB pages.

I have also a question, could this GPUVA code manage VA ranges
mappings for userptr mappings, assuming we work without svm/uva/usm
(pointer-is-a-pointer) ? Because then we are talking about possible
4KB mappings of 1 - 1.5 TB host server RAM (Implied in my question is
the assumption this can be used also for non-VK use-cases. Please tell
me if I'm totally wrong here).

Thanks,
Oded