Re: [RFC 3/3] drm/msm: Add comm/cmdline fields

From: Rob Clark
Date: Wed Apr 19 2023 - 11:00:45 EST


On Wed, Apr 19, 2023 at 6:36 AM Tvrtko Ursulin
<tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
>
>
> On 18/04/2023 15:56, Rob Clark wrote:
> > On Tue, Apr 18, 2023 at 1:53 AM Tvrtko Ursulin
> > <tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
> >>
> >>
> >> On 17/04/2023 21:12, Rob Clark wrote:
> >>> From: Rob Clark <robdclark@xxxxxxxxxxxx>
> >>>
> >>> Normally this would be the same information that can be obtained in
> >>> other ways. But in some cases the process opening the drm fd is merely
> >>> a sort of proxy for the actual process using the GPU. This is the case
> >>> for guest VM processes using the GPU via virglrenderer, in which case
> >>> the msm native-context renderer in virglrenderer overrides the comm/
> >>> cmdline to be the guest process's values.
> >>>
> >>> Exposing this via fdinfo allows tools like gputop to show something more
> >>> meaningful than just a bunch of "pcivirtio-gpu" users.
> >>
> >> You also later expanded with:
> >>
> >> """
> >> I should have also mentioned, in the VM/proxy scenario we have a
> >> single process with separate drm_file's for each guest VM process. So
> >> it isn't an option to just change the proxy process's name to match
> >> the client.
> >> """
> >>
> >> So how does that work - this single process temporarily changes its
> >> name for each drm fd it opens and creates a context, or is it actually
> >> part of the native context protocol?
> >
> > It is part of the protocol, the mesa driver in the VM sends[1] this
> > info to the native-context "shim" in host userspace which uses the
> > SET_PARAM ioctl to pass this to the kernel. In the host userspace
> > there is just a single process (you see the host PID below) but it
> > does a separate open() of the drm dev for each guest process (so that
> > they each have their own GPU address space for isolation):
> >
> >    DRM minor 128
> >     PID    MEM  ACTIV  NAME              gpu
> >    5297   200M    82M  com.mojang.minecr |██████████████▏ |
> >    1859   199M     0B  chrome            |█▉ |
> >    5297    64M     9M  surfaceflinger    | |
> >    5297    12M     0B  org.chromium.arc. | |
> >    5297    12M     0B  com.android.syste | |
> >    5297    12M     0B  org.chromium.arc. | |
> >    5297    26M     0B  com.google.androi | |
> >    5297    65M     0B  system_server     | |
> >
> >
> > [1] https://gitlab.freedesktop.org/virgl/virglrenderer/-/blob/master/src/drm/msm/msm_proto.h#L326
> > [2] https://gitlab.freedesktop.org/virgl/virglrenderer/-/blob/master/src/drm/msm/msm_renderer.c#L1050
> >
> >>>
> >>> Signed-off-by: Rob Clark <robdclark@xxxxxxxxxxxx>
> >>> ---
> >>> Documentation/gpu/drm-usage-stats.rst | 8 ++++++++
> >>> drivers/gpu/drm/msm/msm_gpu.c | 14 ++++++++++++++
> >>> 2 files changed, 22 insertions(+)
> >>>
> >>> diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
> >>> index 8e00d53231e0..bc90bed455e3 100644
> >>> --- a/Documentation/gpu/drm-usage-stats.rst
> >>> +++ b/Documentation/gpu/drm-usage-stats.rst
> >>> @@ -148,6 +148,14 @@ percentage utilization of the engine, whereas drm-engine-<keystr> only reflects
> >>> time active without considering what frequency the engine is operating as a
> >>> percentage of it's maximum frequency.
> >>>
> >>> +- drm-comm: <valstr>
> >>> +
> >>> +Returns the clients executable path.
> >>
> >> Full path and not just current->comm? In this case probably give it a
> >> more descriptive name here.
> >>
> >> drm-client-executable
> >> drm-client-command-line
> >>
> >> So we stay in the drm-client- namespace?
> >>
> >> Or if the former is absolute path could one key be enough for both?
> >>
> >> drm-client-command-line: /path/to/executable --arguments
> >
> > comm and cmdline can be different. Android seems to change the comm to
> > the apk name, for example (and w/ the zygote stuff cmdline isn't
> > really a thing)
> >
> > I guess it could be drm-client-comm and drm-client-cmdline? Although
> > comm/cmdline aren't the best names, they are just following what the
> > kernel calls them elsewhere.
>
> I wasn't sure what you plan to do, given the mention of a path under the
> drm-comm description. If it is a path then comm would be misleading,
> since comm as defined in procfs is not a path, at least I don't think
> so. Which is why I was suggesting executable. But if you remove the
> mention of a path from the rst and rather refer to the process's comm
> value, I think that is then okay.

Oh, whoops, the mention of "path" for comm was a mistake. task->comm
is described as the executable name without the path, and that is what
the fdinfo field was intended to follow.
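
To make the comm-vs-path distinction concrete, here is a rough userspace
sketch (not kernel code, and not part of the patch) of what "executable
name without path" amounts to at exec time. comm_from_path() and COMM_LEN
are hypothetical names for this illustration; COMM_LEN mirrors the
kernel's TASK_COMM_LEN of 16. A process can rename itself afterwards,
which is how Android ends up with the apk name, so this only models the
initial value:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the kernel's TASK_COMM_LEN: 15 characters plus the NUL. */
#define COMM_LEN 16

/*
 * Hypothetical helper (not a kernel API): derive a comm-style name from
 * an executable path by dropping the directory part and truncating.
 */
static void comm_from_path(const char *path, char dst[COMM_LEN])
{
	const char *base;

	assert(path && dst);

	/* Keep only what follows the last '/', if there is one. */
	base = strrchr(path, '/');
	base = base ? base + 1 : path;

	/* Truncate to COMM_LEN - 1 characters and always NUL-terminate. */
	strncpy(dst, base, COMM_LEN - 1);
	dst[COMM_LEN - 1] = '\0';
}
```

With this, "/usr/bin/chrome" becomes "chrome", and a longer basename is
cut off at 15 characters.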

> >>> +
> >>> +- drm-cmdline: <valstr>
> >>> +
> >>> +Returns the clients cmdline.
> >>
> >> I think drm-usage-stats.rst text should provide some more text with
> >> these two. To precisely define their content and outline the use case
> >> under which driver authors may want to add them, and fdinfo consumer
> >> therefore expect to see them. Just so everything is completely clear and
> >> people do not start adding them for drivers which do not support native
> >> context (or like).
> >
> > I really was just piggy-backing on existing comm/cmdline.. but I'll
> > try to write up something better.
> >
> > I think it maybe should not be limited just to native context.. for
> > ex. if the browser did somehow manage to create different displays
> > associated with different drm_file instances (I guess it would have to
> > use gbm to do this?) it would be nice to see browser tab names.
>
> Would be cool yes.
>
> My thinking behind why we maybe do not want to blanket add them is
> that in the common case it is the same information which can be obtained
> from procfs. Like in igt_drm_clients.c I get the pid and comm from
> /proc/$pid/stat. So I was thinking it is only interesting to add to
> fdinfo for drivers where it could differ due to an explicit override,
> like you have with native context.

Yeah, I suppose I could define them as drm-client-comm-override and
drm-client-cmdline-override.
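
On the consumer side that would presumably look something like the
sketch below: use the override key when the driver emits it, otherwise
keep whatever comm was read from procfs. This is only an illustration
under assumptions. The key name is the one floated above and is not
final, fdinfo_get() and client_name() are hypothetical helpers rather
than anything in igt or gputop, and the fdinfo contents are assumed to
have been read into a string already:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: find a "key:<whitespace>value" line in an fdinfo
 * text blob and copy the value (truncated to fit) into out.
 * Returns true if the key was present.
 */
static bool fdinfo_get(const char *fdinfo, const char *key,
		       char *out, size_t outsz)
{
	const char *line = fdinfo;
	size_t klen;

	assert(fdinfo && key && out && outsz > 0);
	klen = strlen(key);

	while (line && *line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);

		/* Match the whole key, immediately followed by ':'. */
		if (len > klen && !strncmp(line, key, klen) &&
		    line[klen] == ':') {
			const char *val = line + klen + 1;
			size_t vlen;

			/* Skip the tab/space padding before the value. */
			while (val < line + len &&
			       (*val == ' ' || *val == '\t'))
				val++;

			vlen = (size_t)(line + len - val);
			if (vlen >= outsz)
				vlen = outsz - 1;
			memcpy(out, val, vlen);
			out[vlen] = '\0';
			return true;
		}
		line = eol ? eol + 1 : NULL;
	}
	return false;
}

/* Prefer the override key; otherwise fall back to the procfs comm. */
static const char *client_name(const char *fdinfo, const char *procfs_comm,
			       char *buf, size_t bufsz)
{
	if (fdinfo_get(fdinfo, "drm-client-comm-override", buf, bufsz))
		return buf;
	return procfs_comm;
}
```

Drivers that never set an override would simply not print the key, and
tools would keep showing the procfs name as they do today.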

> It can be added once there is a GL/whatever extension which would allow
> it? (I am not familiar with how browsers manage rendering contexts so
> maybe I am missing something.)
>
> >> But on the overall it sounds reasonable to me - it would be really cool
> >> to not just see pcivirtio-gpu as you say. Even if the standard virtiogpu
> >> use case (not native context) could show real users.
> >
> > For vrend/virgl, we'd first need to solve the issue that there is just
> > a single drm_file for all guest processes. But really, just don't use
> > virgl. (I mean, like seriously, would you put a gl driver in the
> > kernel? Vrend has access to all guest memory, so this is essentially
> > what you have with virgl. This is just not a sane thing to do.) The
> > only "valid" reason for not doing native-context is if you don't have
> > the src code for your UMD to be able to modify it to talk
> > native-context to virtgpu in the guest. ;-)
>
> I am just observing the current state of things on an Intel based
> Chromebook. :) Presumably the custom name for a context would be
> passable via the virtio-gpu protocol or something?

It is part of the context-type specific protocol. I.e., some parts of
the protocol are "core" and dealt with in the virtgpu guest kernel
driver. But on top of that there are various context-types with their
own protocol (i.e. virgl, venus, cross-domain, msm native ctx, and some
WIP native ctx types floating around).

BR,
-R

> Regards,
>
> Tvrtko
>
> >
> > BR,
> > -R
> >
> >> Regards,
> >>
> >> Tvrtko
> >>
> >>> +
> >>> Implementation Details
> >>> ======================
> >>>
> >>> diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> >>> index f0f4f845c32d..1150dcbf28aa 100644
> >>> --- a/drivers/gpu/drm/msm/msm_gpu.c
> >>> +++ b/drivers/gpu/drm/msm/msm_gpu.c
> >>> @@ -148,12 +148,26 @@ int msm_gpu_pm_suspend(struct msm_gpu *gpu)
> >>> return 0;
> >>> }
> >>>
> >>> +static void get_comm_cmdline(struct msm_file_private *ctx, char **comm, char **cmd);
> >>> +
> >>> void msm_gpu_show_fdinfo(struct msm_gpu *gpu, struct msm_file_private *ctx,
> >>> struct drm_printer *p)
> >>> {
> >>> + char *comm, *cmdline;
> >>> +
> >>> + get_comm_cmdline(ctx, &comm, &cmdline);
> >>> +
> >>> drm_printf(p, "drm-engine-gpu:\t%llu ns\n", ctx->elapsed_ns);
> >>> drm_printf(p, "drm-cycles-gpu:\t%llu\n", ctx->cycles);
> >>> drm_printf(p, "drm-maxfreq-gpu:\t%u Hz\n", gpu->fast_rate);
> >>> +
> >>> + if (comm)
> >>> + drm_printf(p, "drm-comm:\t%s\n", comm);
> >>> + if (cmdline)
> >>> + drm_printf(p, "drm-cmdline:\t%s\n", cmdline);
> >>> +
> >>> + kfree(comm);
> >>> + kfree(cmdline);
> >>> }
> >>>
> >>> int msm_gpu_hw_init(struct msm_gpu *gpu)