Re: [PATCH 2/3] drm/scheduler: Fix UAF in drm_sched_fence_get_timeline_name

From: Christian König
Date: Fri Jul 14 2023 - 05:57:18 EST


Am 14.07.23 um 11:49 schrieb Asahi Lina:
On 14/07/2023 17.43, Christian König wrote:
Am 14.07.23 um 10:21 schrieb Asahi Lina:
A signaled scheduler fence can outlive its scheduler, since fences are
independently reference counted. Therefore, we can't reference the
scheduler in the get_timeline_name() implementation.

Fixes oopses on `cat /sys/kernel/debug/dma_buf/bufinfo` when shared
dma-bufs reference fences from GPU schedulers that no longer exist.

Signed-off-by: Asahi Lina <lina@xxxxxxxxxxxxx>
---
   drivers/gpu/drm/scheduler/sched_entity.c | 7 ++++++-
   drivers/gpu/drm/scheduler/sched_fence.c  | 4 +++-
   include/drm/gpu_scheduler.h              | 5 +++++
   3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index b2bbc8a68b30..17f35b0b005a 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -389,7 +389,12 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
              /*
            * Fence is from the same scheduler, only need to wait for
-         * it to be scheduled
+         * it to be scheduled.
+         *
+         * Note: s_fence->sched could have been freed and reallocated
+         * as another scheduler. This false positive case is okay, as if
+         * the old scheduler was freed all of its jobs must have
+         * signaled their completion fences.

This is outright nonsense. As long as an entity for a scheduler exists
it is not allowed to free up this scheduler.

So this function can't be called like this.

As I already explained, the fences can outlive their scheduler. That means *this* entity certainly exists for *this* scheduler, but the *dependency* fence might have come from a past scheduler that was already destroyed along with all of its entities, and its address reused.
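
To make the lifetime problem concrete, here is a minimal userspace sketch in C. It is not the kernel code and not the patch under review: `model_sched`, `model_fence` and the `model_fence_*` helpers are hypothetical names invented for illustration. It assumes that one way to keep a name lookup safe is for the fence to take its own copy of the name at creation time, so the lookup never follows the scheduler pointer.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userspace model; these are NOT the drm_sched structures. */
struct model_sched {
	char name[32];
};

struct model_fence {
	int refcount;               /* fences are reference counted on their own */
	struct model_sched *sched;  /* may dangle once the scheduler is freed */
	char timeline_name[32];     /* private copy taken at creation time */
};

/* Create a fence that holds one reference and its own copy of the name. */
static struct model_fence *model_fence_create(struct model_sched *sched)
{
	struct model_fence *f = calloc(1, sizeof(*f));

	if (!f)
		return NULL;
	f->refcount = 1;
	f->sched = sched;
	snprintf(f->timeline_name, sizeof(f->timeline_name), "%s", sched->name);
	return f;
}

/* Safe after the scheduler is gone: never dereferences f->sched. */
static const char *model_fence_get_timeline_name(const struct model_fence *f)
{
	return f->timeline_name;
}

/* Drop one reference; free the fence when the last one goes away. */
static void model_fence_put(struct model_fence *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0)
		free(f);
}
```

In this model a caller can free the `model_sched` while still holding a fence reference, and `model_fence_get_timeline_name()` keeps returning the copied string. Any code path that instead compares or follows `f->sched` after that point is looking at an address that may have been reused.
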

Well, this function is not about fences; this function is a callback for the entity.


Christian, I'm really getting tired of your tone. I don't appreciate being told my comments are "outright nonsense" when you don't even take the time to understand what the issue is and what I'm trying to do/document. If you aren't interested in working with me, I'm just going to give up on drm_sched, wait until Rust gets workqueue support, and reimplement it in Rust. You can keep your broken fence lifetime semantics and I'll do my own thing.

I'm certainly trying to help here, but you seem to have unrealistic expectations.

I perfectly understand what you are trying to do, but you don't seem to understand that this functionality here isn't made for your use case.

We can adjust the functionality to better match your requirements, but you can't say it is broken because it doesn't work when you use it in a way it was not intended to be used.

You can go ahead and try to re-implement the functionality in Rust, but then I would reject that, pointing out that this should probably be an extension to the existing code.

Christian.


~~ Lina