Re: [PATCH] tracing/gpu: Add gpu_mem_imported tracepoint

From: Kalesh Singh
Date: Mon Aug 16 2021 - 15:46:17 EST


On Mon, Aug 16, 2021 at 10:22 AM Ørjan Eide <orjan.eide@xxxxxxx> wrote:
>
> On Mon, Jul 26, 2021 at 04:41:31PM +0000, Kalesh Singh wrote:
> > The existing gpu_mem_total tracepoint allows GPU drivers a unifrom way
> > to report the per-process and system-wide GPU memory usage. This
> > tracepoint reports a single total of the GPU private allocations and the
> > imported memory. [1]
> >
> > To allow distinguishing GPU private vs imported memory, add
> > gpu_mem_imported tracepoint.
> >
> > GPU drivers can use this tracepoint to report the per-process and global
> > GPU-imported memory in a uniform way.
>
> Right, as imported dma-buf memory is usually shared between two or more
> processes it will be hard to reconcile system memory usage with process
> private GPU memory and imported dma-buf memory in a single total sum. It
> will be good to improve this and having a separate imported GPU memory
> size is good.
>
> > For backward compatility with already existing implementations of
> > gpu_mem_total by various Android GPU drivers, this is proposed as a new
> > tracepoint rather than additional args to gpu_mem_total.
>
> Is this for compatibility with kernel GPU drivers only, or are there
> user space users of the existing tracepoint that can't cope with it
> being extended?
>
> The eBPF used by Android to track the GPU memory is the only established
> user of the tracepoint that I know about. Can existing versions of the
> eBPF can handle this OK?
>
> If the only worry is GPU kernel drivers then I think that extending the
> existing tracepoint, to add a new field with the imported memory sum,
> will be better. There doesn't appear to be any in-kernel GPU drivers
> implementing this yet, and other GPU kernel drivers can just be extended
> to use the new extended tracepoint. As long as there's some mechanism to
> detect the extended variant of the tracepoint for backports, if the new
> tracepoint is ever backported to older kernels, that shouldn't be a
> problem.

Hi Ørjan. Thank you for your comments.

The compatibility concern was regarding existing user space tools when
devices with older kernels (having the older gpu_mem_total format)
upgrade to newer android releases.

In android, there are 2 user space tools that depend on this
tracepoint: the bpf program to capture the tracepoint data [1] and the
Perfetto profiling tool [2]. I’ve confirmed that Perfetto can handle
both the old and new formats if only a new field is added and the
types of existing fields remain unchanged. For the bpf program, it
turns out we can detect the format from gpu_mem_total/format and load
different versions of the bpf program accordingly.

I'll repost a new version adding only the new imported_mem field.

Thanks,
Kalesh

[1] https://cs.android.com/android/platform/superproject/+/92e677d80bc48772a36ef0d2d9db3195b2dfcf65:frameworks/native/services/gpuservice/bpfprogs/gpu_mem.c;l=51
[2] https://perfetto.dev

>
> --
> Ørjan
>
> > [1] https://lore.kernel.org/r/20200302234840.57188-1-zzyiwei@xxxxxxxxxx/
> >
> > Signed-off-by: Kalesh Singh <kaleshsingh@xxxxxxxxxx>
> > ---
> > include/trace/events/gpu_mem.h | 51 ++++++++++++++++++++++++----------
> > 1 file changed, 36 insertions(+), 15 deletions(-)
> >
> > diff --git a/include/trace/events/gpu_mem.h b/include/trace/events/gpu_mem.h
> > index 26d871f96e94..b9543abf1461 100644
> > --- a/include/trace/events/gpu_mem.h
> > +++ b/include/trace/events/gpu_mem.h
> > @@ -13,21 +13,7 @@
> >
> > #include <linux/tracepoint.h>
> >
> > -/*
> > - * The gpu_memory_total event indicates that there's an update to either the
> > - * global or process total gpu memory counters.
> > - *
> > - * This event should be emitted whenever the kernel device driver allocates,
> > - * frees, imports, unimports memory in the GPU addressable space.
> > - *
> > - * @gpu_id: This is the gpu id.
> > - *
> > - * @pid: Put 0 for global total, while positive pid for process total.
> > - *
> > - * @size: Size of the allocation in bytes.
> > - *
> > - */
> > -TRACE_EVENT(gpu_mem_total,
> > +DECLARE_EVENT_CLASS(gpu_mem_template,
> >
> > TP_PROTO(uint32_t gpu_id, uint32_t pid, uint64_t size),
> >
> > @@ -51,6 +37,41 @@ TRACE_EVENT(gpu_mem_total,
> > __entry->size)
> > );
> >
> > +/*
> > + * The gpu_memory_total event indicates that there's an update to either the
> > + * global or process total gpu memory counters.
> > + *
> > + * This event should be emitted whenever the kernel device driver allocates,
> > + * frees, imports, unimports memory in the GPU addressable space.
> > + *
> > + * @gpu_id: This is the gpu id.
> > + *
> > + * @pid: Put 0 for global total, while positive pid for process total.
> > + *
> > + * @size: Size of the allocation in bytes.
> > + *
> > + */
> > +DEFINE_EVENT(gpu_mem_template, gpu_mem_total,
> > + TP_PROTO(uint32_t gpu_id, uint32_t pid, uint64_t size),
> > + TP_ARGS(gpu_id, pid, size));
> > +
> > +/*
> > + * The gpu_mem_imported event indicates that there's an update to the
> > + * global or process imported gpu memory counters.
> > + *
> > + * This event should be emitted whenever the kernel device driver imports
> > + * or unimports memory (allocated externally) in the GPU addressable space.
> > + *
> > + * @gpu_id: This is the gpu id.
> > + *
> > + * @pid: Put 0 for global total, while positive pid for process total.
> > + *
> > + * @size: Size of the imported memory in bytes.
> > + */
> > +DEFINE_EVENT(gpu_mem_template, gpu_mem_imported,
> > + TP_PROTO(uint32_t gpu_id, uint32_t pid, uint64_t size),
> > + TP_ARGS(gpu_id, pid, size));
> > +
> > #endif /* _TRACE_GPU_MEM_H */
> >
> > /* This part must be outside protection */
> >
> > base-commit: ff1176468d368232b684f75e82563369208bc371
> > --
> > 2.32.0.432.gabb21c7263-goog
> >