Re: [Freedreno] [PATCH v12 07/13] drm/msm: Add a context pointer to the submitqueue

From: Rob Clark
Date: Thu Aug 13 2020 - 12:16:42 EST


On Mon, Aug 10, 2020 at 3:27 PM Jordan Crouse <jcrouse@xxxxxxxxxxxxxx> wrote:
>
> Each submitqueue is attached to a context. Add a pointer to the
> context to the submitqueue at create time and refcount it so
> that it stays around through the life of the queue.
>
> GPU submissions can access the active context via the submitqueue
> instead of requiring it to be passed around from function to
> function.
>
> Signed-off-by: Jordan Crouse <jcrouse@xxxxxxxxxxxxxx>
> ---
>
> drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 12 +++++-------
> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 5 ++---
> drivers/gpu/drm/msm/adreno/adreno_gpu.c | 5 ++---
> drivers/gpu/drm/msm/adreno/adreno_gpu.h | 3 +--
> drivers/gpu/drm/msm/msm_drv.c | 3 ++-
> drivers/gpu/drm/msm/msm_drv.h | 8 ++++++++
> drivers/gpu/drm/msm/msm_gem.h | 1 +
> drivers/gpu/drm/msm/msm_gem_submit.c | 8 ++++----
> drivers/gpu/drm/msm/msm_gpu.c | 9 ++++-----
> drivers/gpu/drm/msm/msm_gpu.h | 7 +++----
> drivers/gpu/drm/msm/msm_submitqueue.c | 8 +++++++-
> 11 files changed, 39 insertions(+), 30 deletions(-)
>

[snip]

> diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
> index 972490b14ba5..9c573c4269cb 100644
> --- a/drivers/gpu/drm/msm/msm_gem.h
> +++ b/drivers/gpu/drm/msm/msm_gem.h
> @@ -142,6 +142,7 @@ struct msm_gem_submit {
> bool valid; /* true if no cmdstream patching needed */
> bool in_rb; /* "sudo" mode, copy cmds into RB */
> struct msm_ringbuffer *ring;
> + struct msm_file_private *ctx;

So, it looks like this is (currently) unused, and everything is
instead using submit->queue->ctx

That said, changing this so the submit also holds a ref to the ctx
seems to fix the intermittent splat I can trigger by repeatedly
hanging the gpu. Which (from the pile of additional tracepoints I've
added on top of this series) seems to be related to re-playing submits
after the userspace process has crashed and/or closed the device.

It seems like the reference the submit holds to the queue should keep
the ctx (and therefore address space) alive, but I need to dig through
that a bit more.

BR,
-R

> unsigned int nr_cmds;
> unsigned int nr_bos;
> u32 ident; /* A "identifier" for the submit for logging */