Re: [PATCH 0/1] Always record job cycle and timestamp information

From: Daniel Vetter
Date: Fri Feb 16 2024 - 11:57:30 EST


On Wed, Feb 14, 2024 at 01:52:05PM +0000, Steven Price wrote:
> Hi Adrián,
>
> On 14/02/2024 12:14, Adrián Larumbe wrote:
> > A driver user expressed interest in being able to access engine usage stats
> > through fdinfo when debugfs is not built into their kernel. In the current
> > implementation, this wasn't possible, because it was assumed that, even
> > for inflight jobs, enabling the cycle counter and timestamp registers
> > would incur additional power consumption, so both were kept disabled
> > until toggled through debugfs.
> >
> > A second read of the TRM made me think otherwise, but this is something
> > that would be best clarified by someone from ARM's side.
>
> I'm afraid I can't give a definitive answer. This will probably vary
> depending on implementation. The command register enables/disables
> "propagation" of the cycle/timestamp values. This propagation will cost
> some power (gates are getting toggled) but whether that power is
> completely in the noise of the GPU as a whole I can't say.
>
> The out-of-tree kbase driver only enables the counters for jobs
> explicitly marked (BASE_JD_REQ_PERMON) or due to an explicit connection
> from a profiler.
>
> I'd be happier moving the debugfs file to sysfs rather than assuming
> that the power consumption is small enough for all platforms.
>
> Ideally we'd have some sort of kernel interface for a profiler to inform
> the kernel what it is interested in, but I can't immediately see how to
> make that useful across different drivers. kbase's profiling support is
> great with our profiling tools, but there's a very strong connection
> between the two.

Yeah, I'm not sure whether a magic file in sysfs (worse, one that probably
differs massively between drivers) is needed to enable GPU perf monitoring
stats in fdinfo.
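
For context, the stats in question show up as plain "key: value" lines in
/proc/<pid>/fdinfo/<fd>. Below is a minimal sketch of how a userspace tool
might read them, assuming the drm-engine-<name> and drm-cycles-<name> keys
described in the DRM client usage stats documentation. The sample text, its
numbers and the two helper names are invented for illustration and are not
taken from the patch.

```python
def parse_drm_fdinfo(text):
    """Collect the drm-* key/value pairs from one fdinfo dump."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Skip lines without a colon and the generic (non-DRM) fdinfo fields.
        if not sep or not key.startswith("drm-"):
            continue
        stats[key.strip()] = value.strip()
    return stats


def engine_busy_ns(stats):
    """Map engine name -> busy time in nanoseconds."""
    prefix = "drm-engine-"
    busy = {}
    for key, value in stats.items():
        # drm-engine-capacity-<name> is a separate key, not a busy time.
        if key.startswith(prefix) and not key.startswith(prefix + "capacity-"):
            # The value looks like "<uint> ns"; keep only the number.
            busy[key[len(prefix):]] = int(value.split()[0])
    return busy


# Hypothetical fdinfo content; every value here is made up.
SAMPLE = """\
pos:\t0
flags:\t02100002
drm-driver:\tpanfrost
drm-client-id:\t12
drm-engine-fragment:\t1846584880 ns
drm-cycles-fragment:\t1424359409
drm-engine-vertex-tiler:\t71932239 ns
drm-cycles-vertex-tiler:\t56396019
"""

if __name__ == "__main__":
    for name, ns in engine_busy_ns(parse_drm_fdinfo(SAMPLE)).items():
        print(name, ns)
```

A real tool would read the text from /proc/<pid>/fdinfo/<fd> instead of
SAMPLE and sample it twice to turn the counters into a utilisation figure.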

I get that we do have a bit of a gap because the Linux perf pmu stuff is
global, and you want per-process, and there's kinda no per-process support
for perf stats for devices. But that's probably the direction we want to
go, not so much fdinfo. At least for hardware performance counters and
things like that.

Iirc the i915 pmu support had some integration for per-process support;
you might want to chat with Tvrtko for the kernel side and Lionel for more
of the userspace side. At least if I'm not making a complete mess and my
memory is vaguely related to reality. Adding them both.

Cheers, Sima


>
> Steve
>
> > Adrián Larumbe (1):
> > drm/panfrost: Always record job cycle and timestamp information
> >
> > drivers/gpu/drm/panfrost/Makefile | 2 --
> > drivers/gpu/drm/panfrost/panfrost_debugfs.c | 21 ------------------
> > drivers/gpu/drm/panfrost/panfrost_debugfs.h | 14 ------------
> > drivers/gpu/drm/panfrost/panfrost_device.h | 1 -
> > drivers/gpu/drm/panfrost/panfrost_drv.c | 5 -----
> > drivers/gpu/drm/panfrost/panfrost_job.c | 24 ++++++++-------------
> > drivers/gpu/drm/panfrost/panfrost_job.h | 1 -
> > 7 files changed, 9 insertions(+), 59 deletions(-)
> > delete mode 100644 drivers/gpu/drm/panfrost/panfrost_debugfs.c
> > delete mode 100644 drivers/gpu/drm/panfrost/panfrost_debugfs.h
> >
> >
> > base-commit: 6b1f93ea345947c94bf3a7a6e668a2acfd310918
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch