Re: [PATCH v2 2/4] dma-buf: enable signaling for the stub fence on debug

From: Christian König
Date: Tue Sep 06 2022 - 03:10:35 EST




On 05.09.22 at 18:35, Arvind Yadav wrote:
On debug builds, enable software signaling for the stub fence,
which is always signaled. The fence needs software signaling
enabled; otherwise the AMD GPU scheduler triggers a GPU reset
because the scheduler cleanup activity times out.

Signed-off-by: Arvind Yadav <Arvind.Yadav@xxxxxxx>
---

Changes in v1 :
1- Addressed Christian's comment to remove the unnecessary callback.
2- Using CONFIG_DEBUG_WW_MUTEX_SLOWPATH instead of CONFIG_DEBUG_FS.
3- The patch numbering also changed; previously
it was [PATCH 3/4].

---
drivers/dma-buf/dma-fence.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 066400ed8841..2378b12538c4 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -27,6 +27,10 @@ EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
static DEFINE_SPINLOCK(dma_fence_stub_lock);
static struct dma_fence dma_fence_stub;
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+static bool __dma_fence_enable_signaling(struct dma_fence *fence);
+#endif
+

I would rename the function to something like dma_fence_enable_signaling_locked().

And please don't add any #ifdef if it isn't absolutely necessary. This makes the code pretty fragile.
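
To illustrate both points, here is a rough userspace sketch, not kernel code. The mock_* names and the DEBUG_SIGNALING macro are made up for this example; in the kernel the condition would more likely be IS_ENABLED() on the Kconfig symbol, which is my assumption about how to apply the comment and not something the patch does. The idea is that a plain C condition keeps both branches compiled and type-checked, so the forward declaration needs no guard either.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for a Kconfig option. Kernel code would not
 * define this itself; it would test IS_ENABLED(CONFIG_...) instead.
 */
#ifndef DEBUG_SIGNALING
#define DEBUG_SIGNALING 1
#endif

/* Mock fence, reduced to the two states this example needs. */
struct mock_fence {
	bool signaling_enabled;
	bool signaled;
};

/*
 * The _locked suffix documents that the caller already holds the
 * fence lock. This mock has no lock, so the suffix only shows the naming.
 */
static bool mock_fence_enable_signaling_locked(struct mock_fence *fence)
{
	assert(fence);

	/* Nothing to enable once the fence has signaled. */
	if (fence->signaled)
		return false;

	fence->signaling_enabled = true;
	return true;
}

static void mock_fence_init_stub(struct mock_fence *fence)
{
	fence->signaling_enabled = false;
	fence->signaled = false;

	/*
	 * Ordinary C condition instead of #ifdef: the call is always
	 * parsed and type-checked, and the compiler drops it when the
	 * constant is 0.
	 */
	if (DEBUG_SIGNALING)
		mock_fence_enable_signaling_locked(fence);

	fence->signaled = true;
}
```

With this shape, a build with the option off still catches a signature change in the helper, which is the fragility the #ifdef version hides.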

/*
* fence context counter: each execution context should have its own
* fence context, this allows checking if fences belong to the same
@@ -136,6 +140,9 @@ struct dma_fence *dma_fence_get_stub(void)
&dma_fence_stub_ops,
&dma_fence_stub_lock,
0, 0);
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+ __dma_fence_enable_signaling(&dma_fence_stub);
+#endif

Alternatively, in this particular case you could just set the bit manually here, since this is part of the dma_fence code anyway.

Christian.

dma_fence_signal_locked(&dma_fence_stub);
}
spin_unlock(&dma_fence_stub_lock);
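
A minimal userspace sketch of the "set the bit manually" alternative mentioned above, under stated assumptions: every mock_* name is hypothetical, the flag names only mirror the kernel's DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT and DMA_FENCE_FLAG_SIGNALED_BIT, and the bit helpers here are plain non-atomic stand-ins for the kernel's set_bit()/test_bit(). Locking is left out entirely.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Mock flag bits; the numeric values are illustrative only. */
enum mock_fence_flag_bits {
	MOCK_FENCE_FLAG_SIGNALED_BIT,
	MOCK_FENCE_FLAG_ENABLE_SIGNAL_BIT,
};

struct mock_fence {
	unsigned long flags;
	bool initialized;
};

/* Non-atomic stand-in for the kernel's set_bit(). */
static void mock_set_bit(int nr, unsigned long *addr)
{
	assert(nr >= 0 && nr < (int)(sizeof(*addr) * CHAR_BIT));
	*addr |= 1UL << nr;
}

/* Non-atomic stand-in for the kernel's test_bit(). */
static bool mock_test_bit(int nr, const unsigned long *addr)
{
	return (*addr >> nr) & 1UL;
}

static struct mock_fence mock_stub;

/* Initialize the stub on first use and return it already signaled. */
static struct mock_fence *mock_fence_get_stub(void)
{
	if (!mock_stub.initialized) {
		mock_stub.flags = 0;
		mock_stub.initialized = true;

		/*
		 * Set the enable-signaling bit directly instead of going
		 * through a helper, then mark the fence as signaled.
		 */
		mock_set_bit(MOCK_FENCE_FLAG_ENABLE_SIGNAL_BIT,
			     &mock_stub.flags);
		mock_set_bit(MOCK_FENCE_FLAG_SIGNALED_BIT,
			     &mock_stub.flags);
	}
	return &mock_stub;
}
```

Setting the bit before signaling means the stub never appears as "signaled without signaling enabled", and no forward declaration or #ifdef is needed for it.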