[PATCH AUTOSEL 6.5 10/37] drm/amdkfd: Fix a race condition of vram buffer unref in svm code

From: Sasha Levin
Date: Tue Nov 07 2023 - 07:36:34 EST


From: Xiaogang Chen <xiaogang.chen@xxxxxxx>

[ Upstream commit 709c348261618da7ed89d6c303e2ceb9e453ba74 ]

prange->svm_bo can be unreferenced both from the MMU notifier callback and
from a callback that runs after migration to system RAM. Both are
asynchronous calls made from different tasks. Serialize the svm_bo unref
operation to avoid random use-after-free.

Signed-off-by: Xiaogang Chen <xiaogang.chen@xxxxxxx>
Reviewed-by: Philip Yang <Philip.Yang@xxxxxxx>
Reviewed-by: Jesse Zhang <Jesse.Zhang@xxxxxxx>
Tested-by: Jesse Zhang <Jesse.Zhang@xxxxxxx>
Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
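
For readers who want the locking pattern in isolation, here is a minimal
user-space sketch of the guard used in the hunk below. It is not the kernel
code: demo_bo, demo_range, demo_bo_unref(), demo_range_free() and
demo_free_twice() are hypothetical names, a pthread mutex stands in for
prange->lock, and a plain integer stands in for the buffer object's
reference count. Two sequential calls stand in for the two asynchronous
callbacks; the sketch only shows that the second caller finds the resource
pointer already cleared and skips the unref.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted buffer object. */
struct demo_bo {
	int refcount;
};

/* Hypothetical stand-in for a range that owns one reference to a bo. */
struct demo_range {
	pthread_mutex_t lock;
	void *res;		/* non-NULL while the range still holds its reference */
	struct demo_bo *bo;
};

static void demo_bo_unref(struct demo_bo *bo)
{
	/* Dropping a reference that is already gone would be the bug. */
	assert(bo->refcount > 0);
	bo->refcount--;
}

/* Safe to call from more than one path: only the first caller unrefs. */
static void demo_range_free(struct demo_range *r)
{
	int owner;

	/* Test and clear the marker under the lock ... */
	pthread_mutex_lock(&r->lock);
	owner = (r->res != NULL);
	r->res = NULL;
	pthread_mutex_unlock(&r->lock);

	/* ... and drop the reference outside the lock, at most once. */
	if (owner)
		demo_bo_unref(r->bo);
}

/* Calls the free path twice and returns the remaining reference count. */
static int demo_free_twice(void)
{
	static int token;	/* any non-NULL address works as the marker */
	struct demo_bo bo = { .refcount = 1 };
	struct demo_range r = { .res = &token, .bo = &bo };

	pthread_mutex_init(&r.lock, NULL);
	demo_range_free(&r);	/* first caller drops the reference */
	demo_range_free(&r);	/* second caller sees res == NULL and returns */
	pthread_mutex_destroy(&r.lock);

	return bo.refcount;
}
```

Without the check on res, the second call would trip the assertion in
demo_bo_unref(), which is the user-space analogue of the double unref this
patch guards against.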

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 5ff1a5a89d968..ed365f8ebf53f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -617,8 +617,15 @@ svm_range_vram_node_new(struct kfd_node *node, struct svm_range *prange,

 void svm_range_vram_node_free(struct svm_range *prange)
 {
-	svm_range_bo_unref(prange->svm_bo);
-	prange->ttm_res = NULL;
+	/* serialize prange->svm_bo unref */
+	mutex_lock(&prange->lock);
+	/* prange->svm_bo has not been unref */
+	if (prange->ttm_res) {
+		prange->ttm_res = NULL;
+		mutex_unlock(&prange->lock);
+		svm_range_bo_unref(prange->svm_bo);
+	} else
+		mutex_unlock(&prange->lock);
 }
 
 struct kfd_node *
--
2.42.0