Re: [PATCH] drm/ttm: fix error handling in ttm_bo_handle_move_mem()

From: Christian König
Date: Wed Jun 16 2021 - 04:47:21 EST




On 16.06.21 at 10:37, Dan Carpenter wrote:
On Wed, Jun 16, 2021 at 08:46:33AM +0200, Christian König wrote:
Sending the first message didn't work, so let's try again.

On 16.06.21 at 08:30, Dan Carpenter wrote:
There are three bugs here:
1) We need to call unpopulate() if ttm_tt_populate() succeeds.
2) The "new_man = ttm_manager_type(bdev, bo->mem.mem_type);" assignment
was wrong and it was really assigning "new_mem = old_mem;". There
is no need for this assignment anyway as we already have the value
for "new_mem".
3) The (!new_man->use_tt) condition is reversed.

Fixes: ba4e7d973dd0 ("drm: Add the TTM GPU memory manager subsystem.")
Signed-off-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
---
This is from reading the code and I can't swear that I have understood
it correctly. My nouveau driver is currently unusable and this patch
has not helped. But hopefully if I fix enough bugs eventually it will
start to work.
Well NAK, the code previously looked quite good and you are breaking it now.

What's the problem with nouveau?

The new Firefox seems to exercise nouveau more than the old one, so
when I start 10 Firefox windows it just hangs the graphics.

I've added debug code and it seems like the problem is that
nv50_mem_new() is failing.

Sounds like it is running out of memory to me.

Do you have a dmesg?



drivers/gpu/drm/ttm/ttm_bo.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index ebcffe794adb..72dde093f754 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -180,12 +180,12 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
 		 */
 		ret = ttm_tt_create(bo, old_man->use_tt);
 		if (ret)
-			goto out_err;
+			return ret;
 		if (mem->mem_type != TTM_PL_SYSTEM) {
 			ret = ttm_tt_populate(bo->bdev, bo->ttm, ctx);
 			if (ret)
-				goto out_err;
+				goto err_destroy;
 		}
 	}
@@ -193,15 +193,17 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
 	if (ret) {
 		if (ret == -EMULTIHOP)
 			return ret;
-		goto out_err;
+		goto err_unpopulate;
 	}
 	ctx->bytes_moved += bo->base.size;
 	return 0;
-out_err:
-	new_man = ttm_manager_type(bdev, bo->mem.mem_type);
This switches the new and old manager, i.e. new_man now points to
the existing resource manager.
Why not just use "old_man" instead of what is basically the equivalent of
"new_man = old_man"? Can the old_man change part way through the
function?

Good question :)

I don't think that old_man could change, and yes, that would be much easier to understand.

Regards,
Christian.


regards,
dan carpenter
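
For reference, the two labels the patch introduces (err_unpopulate and err_destroy) look like the usual kernel goto-unwind idiom, presumably ordered so that the later label falls through to the earlier one; the quoted diff is cut off before the labels themselves. Below is a minimal userspace sketch of that idiom. Every fake_* name and the fail_step parameter are hypothetical stand-ins, not the real TTM API, and the sketch takes no position on whether the patch is right for TTM.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer object; not the real TTM struct. */
struct fake_bo {
	void *tt;       /* stands in for bo->ttm */
	int populated;  /* nonzero once the backing pages are "populated" */
};

static int fake_tt_create(struct fake_bo *bo)
{
	bo->tt = malloc(16);
	return bo->tt ? 0 : -ENOMEM;
}

static void fake_tt_destroy(struct fake_bo *bo)
{
	free(bo->tt);
	bo->tt = NULL;
}

static int fake_tt_populate(struct fake_bo *bo, int fail)
{
	if (fail)
		return -ENOMEM;
	bo->populated = 1;
	return 0;
}

static void fake_tt_unpopulate(struct fake_bo *bo)
{
	bo->populated = 0;
}

/* fail_step: 0 = no failure, 1 = populate fails, 2 = the move step fails */
static int fake_handle_move(struct fake_bo *bo, int fail_step)
{
	int ret;

	ret = fake_tt_create(bo);
	if (ret)
		return ret;             /* nothing acquired yet, nothing to undo */

	ret = fake_tt_populate(bo, fail_step == 1);
	if (ret)
		goto err_destroy;       /* undo create only */

	ret = (fail_step == 2) ? -EIO : 0;  /* stands in for the driver move callback */
	if (ret)
		goto err_unpopulate;    /* undo populate, then create */

	return 0;

err_unpopulate:
	fake_tt_unpopulate(bo);
err_destroy:                            /* reached by fall-through or direct jump */
	fake_tt_destroy(bo);
	return ret;
}
```

Each label undoes exactly one acquisition, and the labels appear in reverse order of acquisition, so a failure at any step releases only what was already set up.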