[PATCH] drm/msm/a6xx: Make GPU destroy a bit safer

From: Douglas Anderson
Date: Thu Feb 02 2023 - 13:49:19 EST


If, for whatever reason, we're trying to process
adreno_runtime_resume() at the same time that a6xx_destroy() is
running then things can go boom. Specifically adreno_runtime_resume()
will eventually call a6xx_pm_resume() and that may try to resume the
GMU.

Let's grab the GMU lock as we're destroying the GMU. That will solve
the race because a6xx_pm_resume() grabs the same lock. That makes the
access of `gmu->initialized` in a6xx_gmu_resume() safe.

We'll also return an error code from a6xx_gmu_resume() if we see that
`gmu->initialized` is false. If this happens we'll bail out of the
rest of a6xx_pm_resume(), which is what we want because the rest of
that function is also unsafe to run if we're racing with
a6xx_destroy().
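
As a rough illustration of the scheme described above, here is a
minimal userspace sketch that models it with a pthread mutex. All of
the model_* names and the later_steps_run counter are hypothetical
stand-ins invented for this example, not the driver's code; the sketch
only assumes what the description states: the resume path takes the
same lock as the destroy path, and a nonzero return from the GMU
resume step skips the rest of the resume.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the GMU state; not the real struct a6xx_gmu. */
struct model_gmu {
	pthread_mutex_t lock;
	bool initialized;
};

/* Models the patched check: the caller must hold gmu->lock. */
static int model_gmu_resume(struct model_gmu *gmu)
{
	if (!gmu->initialized)
		return -EINVAL;	/* previously this path returned 0 */
	return 0;
}

/*
 * Models the resume path: take the lock, try to resume the GMU, and
 * skip the remaining steps if that failed. later_steps_run counts how
 * often the "rest of the function" was reached.
 */
static int model_pm_resume(struct model_gmu *gmu, int *later_steps_run)
{
	int ret;

	pthread_mutex_lock(&gmu->lock);
	ret = model_gmu_resume(gmu);
	pthread_mutex_unlock(&gmu->lock);
	if (ret)
		return ret;

	(*later_steps_run)++;
	return 0;
}

/* Models the destroy path: tear down the GMU under the same lock. */
static void model_destroy(struct model_gmu *gmu)
{
	pthread_mutex_lock(&gmu->lock);
	gmu->initialized = false;
	pthread_mutex_unlock(&gmu->lock);
}
```

In this model, a resume that runs after model_destroy() returns
-EINVAL and never reaches the later steps, while a resume that wins
the race completes before the teardown can start.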

Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
---
This doesn't _really_ matter for upstream, but downstream in ChromeOS
we have a GPU inputboost patch. That inputboost patch was related to
adreno_runtime_resume() getting called at the same time that
a6xx_destroy() was running. This was seen at bootup when the panel
failed to probe.

Despite the fact that this isn't truly fixing any bugs upstream, it
still seems like a general improvement for the GPU driver.

drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 2 +-
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 ++
2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index f3c9600221d4..7f5bc73b2040 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -974,7 +974,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
 	int status, ret;
 
 	if (WARN(!gmu->initialized, "The GMU is not set up yet\n"))
-		return 0;
+		return -EINVAL;
 
 	gmu->hung = false;

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index aae60cbd9164..6faea5049f76 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1746,7 +1746,9 @@ static void a6xx_destroy(struct msm_gpu *gpu)

 	a6xx_llc_slices_destroy(a6xx_gpu);
 
+	mutex_lock(&a6xx_gpu->gmu.lock);
 	a6xx_gmu_remove(a6xx_gpu);
+	mutex_unlock(&a6xx_gpu->gmu.lock);
 
 	adreno_gpu_cleanup(adreno_gpu);

--
2.39.1.519.gcb327c4b5f-goog