On Thu, 16 Nov 2023 14:53:50 +0100
Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx> wrote:
On Thu, 2023-11-16 at 14:27 +0100, Boris Brezillon wrote:Sure, I have no problem with that.
On Thu, 16 Nov 2023 12:48:45 +0100Sure, Although I think initial merge will be as is, and then merging
Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx> wrote:
Hi, Boris,Well, I'd suggest that the one doing changes to drm_gpuvm gets to
Thanks for reviewing. Some comments below:
On Thu, 2023-11-16 at 10:48 +0100, Boris Brezillon wrote:
Hi Thomas,Yes, I can try to be a bit more drm_gpuvm-centric throughout the
On Wed, 15 Nov 2023 13:49:37 +0100
Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx> wrote:
Add the first version of the VM_BIND locking document which isShould we mention that it's a driver object, likely inheriting
intended to be part of the xe driver upstreaming agreement.
The document describes and discuss the locking used during
exec-
functions, evicton and for userptr gpu-vmas. Intention is to be
using the
same nomenclature as the drm-vm-bind-async.rst.
v2:
- s/gvm/gpu_vm/g (Rodrigo Vivi)
- Clarify the userptr seqlock with a pointer to
mm/mmu_notifier.c
(Rodrigo Vivi)
- Adjust commit message accordingly.
- Add SPDX license header.
v3:
- Large update to align with the drm_gpuvm manager locking
- Add "Efficient userptr gpu_vma exec function iteration"
section
- Add "Locking at bind- and unbind time" section.
v4:
- Fix tabs vs space errors by untabifying (Rodrigo Vivi)
- Minor style fixes and typos (Rodrigo Vivi)
- Clarify situations where stale GPU mappings are occurring and
how
access through these mappings are blocked. (Rodrigo Vivi)
- Insert into the toctree in implementation_guidelines.rst
Cc: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx>
Signed-off-by: Thomas Hellström
<thomas.hellstrom@xxxxxxxxxxxxxxx>
---
Documentation/gpu/drm-vm-bind-locking.rst | 503
++++++++++++++++++
.../gpu/implementation_guidelines.rst | 1 +
2 files changed, 504 insertions(+)
create mode 100644 Documentation/gpu/drm-vm-bind-locking.rst
diff --git a/Documentation/gpu/drm-vm-bind-locking.rst
b/Documentation/gpu/drm-vm-bind-locking.rst
new file mode 100644
index 000000000000..bc701157cb34
--- /dev/null
+++ b/Documentation/gpu/drm-vm-bind-locking.rst
@@ -0,0 +1,503 @@
+.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+
+===============
+VM_BIND locking
+===============
+
+This document attempts to describe what's needed to get
VM_BIND
locking right,
+including the userptr mmu_notifier locking and it will also
discuss some
+optimizations to get rid of the looping through of all userptr
mappings and
+external / shared object mappings that is needed in the
simplest
+implementation. It will also discuss some implications for
faulting gpu_vms.
+
+Nomenclature
+============
+
+* ``Context``: GPU execution context.
+* ``gpu_vm``: Abstraction of a virtual GPU address space with
+ meta-data. Typically one per client (DRM file-private), or
one
per
+ context.
from
drm_gpuvm?
document, although I want to avoid being too specific due to the
current rapid drm_gpuvm rate of change, which might also affect
this
document. I guess I will have to commit for now to update the
document
on each gpuvm series we land...
update this document along the way, or even better, make this
documentation part of the drm_gpuvm doc, so there's no excuse to not
update it when drm_gpuvm is extended.
with drm_gpuvm will come at a later point.
Okay, but then I don't see the problem if you leave such vm_bos in theYes this section here is the key: The vmas belonging to evicted bos notIIUC, the partial VMA restore is about not restoring all VMAsThis is the use-case Christian has been flagging for for OpenGL and+ add_dma_fence(&obj->resv, job_dma_fence);This last sentence is hard to follow.
+
+ dma_resv_unlock(&obj->resv);
+ put_object(obj);
+
+Note that since the object is local to the gpu_vm, it will
share
the gpu_vm's
+dma_resv lock so that ``obj->resv == gpu_vm->resv``.
+The gpu_vm_bos marked for eviction are put on the gpu_vm's
evict
list,
+which is protected by ``gpu_vm->resv``, that is always locked
while
+evicting, due to the above equality.
+
+For VM_BIND gpu_vms, gpu_vmas don't need to be unbound before
eviction,
+Since the driver must ensure that the eviction blit or copy
will
wait
+for GPU idle or depend on all previous GPU activity.
Furthermore,
any
+subsequent attempt by the GPU to access freed memory through
the
+gpu_vma will be preceded by a new exec function, with a
revalidation
+section which will make sure all gpu_vmas are rebound. The
eviction
+code holding the object's dma_resv while revalidating will
ensure
a
+new exec function may not race with the eviction. Note that
this
will
+not hold true, however, if only a subsets of vmas are, due to
the
+driver implementation, selected for rebinding the next exec
+function.
"
Note that this will not hold true if only a subset of vmas
are selected for rebinding during the next exec call (for
instance,
due
to some driver decision to only partially restore VMAs).
"
Then all vmas *not* selected for rebinding needs to beI think I get the problem, but can we have a use case where
+properly unbound before re-enabling GPU access to the VM.
partial
VMA restoration is useful? I mean, if some VMAs are not needed,
we
might be able to save MMU page table allocation/setup-time, but
given
the mess it then is to track those non-live VMAs, I'm wondering
if
it's
leaving the door open for that, unless there's a good reason to
do
it.
Media where he warns that the single-vm-memory-overcommit case
would
otherwise make the app crawl.
attached
to a vm_bo, but as soon as you restore one, this makes the BO
resident,
so all you're saving here is the extra page table for non-restored
VMAs.
I don't think that would significantly help the overcommit use case,
unless you have so many VMAs attached to a single vm_bo that the
amount of extra page tables becomes non-negligible compared to the BO
size itself.
What would really help the overcommit use case is not restoring all
evicted BOs, if some of them are known to be in a range that's not
accessed by a GPU job. In that situation, you can decide to leave
vm_bos in the evicted list if none of their VMAs overlap any of the
VM
ranges used by a job.
restored would be the ones not "selected for rebind".
evicted list. Those will still be considered for 'rebind' next time the
evicted list is walked (basically on the next exec), right?
I see. As long as you're sure the VM is completely inactive, and thatIt's the point of teardown that matters here. You can return the pagesGeneralized one might want to think of these as groups of (orMaybe I don't understand what unbind-before-eviction means. To me
perhaps
virtual ranges of) vmas that need to be resident for a single job
submission. Currently we assume the residency-group <-> vm mapping
is
1:1, allowing for the unbind-before-eviction to be ignored, but I
figure moving forward and addressing performance problems of real-
world
applications that may not always be true.
it's
the operation of tearing down all VM mappings (unbind) before
returning
BO pages to the system (evict). I don't see a situation where the
eviction of a BO doesn't imply killing all VM mapping pointing to
this
BO.
to system without tearing down the GPU mappings, as long as you tear
them down before there are any GPU activity on that vm again. In xe we
tear down the old mappings as part of the rebind process after we've
validated the evicted bo again, (but before submitting the GPU job).
The point is the stale mappings can be left as long as there is no gpu
job accessing them.
such mappings are destroyed before the VM is used again, that's fine I
guess. Might be worth a detailed explanation about the different
scenarios though.