Re: [RFC PATCH 1/1] smp: Add tracepoints for functions called with smp_call_function*()

From: Leonardo Bras Soares Passos
Date: Wed May 03 2023 - 11:54:48 EST


On Wed, May 3, 2023 at 12:17 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Wed, Apr 19, 2023 at 12:45:08AM -0300, Leonardo Brás wrote:
> > On Thu, 2023-04-06 at 11:55 +0200, Peter Zijlstra wrote:
> > > On Thu, Apr 06, 2023 at 04:57:18AM -0300, Leonardo Bras wrote:
> > > > When running RT workloads in isolated CPUs, many cases of deadline misses
> > > > are caused by remote CPU requests such as smp_call_function*().
> > > >
> > > > For those cases, having the names of the functions running around the
> > > > deadline miss moment could help find a target for the next improvements.
> > > >
> > > > Add tracepoints for acquiring the function name & argument before entry and
> > > > after exiting the called function.
> > > >
> > > > Signed-off-by: Leonardo Bras <leobras@xxxxxxxxxx>
> > >
> > > How are the patches queued there not sufficient?
> > >
> > > https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=smp/core
> > >
> >
> > IIUC the last commits add tracepoints that are collected in the
> > requesting CPU, at the moment of scheduling the IPI, which are also useful in
> > some scenarios.
> >
> > In my scenario, it could help a little, since it makes it possible to filter
> > what all other CPUs are scheduling on the requested CPU. OTOH it could also be
> > misleading, as the requested CPU could be running something that was scheduled
> > way before.
> >
> > The change I propose does exactly what my scenario needs: track exactly which
> > function was running at a given time on the requested CPU. With this info, we
> > can check which (if any) remotely requested function was running in a given
> > time window.
>
> I was thinking you could simply (graph)-trace
> __flush_smp_call_function_queue() with a max_graph_depth or so (Steve
> says that ought to work).
>
> But even that might be too specific, your use case sounds more like what
> we have the irq-off latency tracer for, and that thing will immediately
> tell you what functions were being run.
>
> > (An unrelated thought: we could even use the commits you pointed to, together
> > with my proposed change, to measure how long it takes for a requested
> > function to run / complete on the requested CPU.)
>
> I don't think you could actually do that; the send tracepoints Valentin
> added don't log the csd address, this means you cannot distinguish
> two CSDs with the same function sent from different CPUs.
>
> To do this you'd need to add the csd address to the ipi_send
> tracepoints and your own (possibly replacing info -- because I don't
> think that's too useful).
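
A minimal post-processing sketch of the point above, assuming both the send and
the run events carried the csd address. The record layout, field names, function
name and timestamps are all hypothetical and do not mirror the real tracepoint
formats; it only illustrates why matching on the function alone is ambiguous.

```python
# Hypothetical, simplified trace records (illustrative fields only).
# Two CPUs send the same function to CPU 3 using different CSDs.
sends = [
    {"ts": 100, "from_cpu": 0, "to_cpu": 3, "func": "do_flush", "csd": 0xA0},
    {"ts": 105, "from_cpu": 1, "to_cpu": 3, "func": "do_flush", "csd": 0xB0},
]
# Run-side events on CPU 3; the order here is arbitrary for illustration.
runs = [
    {"ts": 140, "cpu": 3, "func": "do_flush", "csd": 0xB0},
    {"ts": 150, "cpu": 3, "func": "do_flush", "csd": 0xA0},
]


def candidates(run, sends, use_csd):
    """Return the send events that could have produced this run event."""
    out = []
    for s in sends:
        # A send can only match if it targets this CPU, names the same
        # function, and happened no later than the run event.
        if s["to_cpu"] != run["cpu"] or s["func"] != run["func"]:
            continue
        if s["ts"] > run["ts"]:
            continue
        # With the csd address logged on both sides, filter on it too.
        if use_csd and s["csd"] != run["csd"]:
            continue
        out.append(s)
    return out


def latencies(runs, sends):
    """(sender CPU, send-to-run delta) for runs with exactly one match."""
    result = []
    for r in runs:
        c = candidates(r, sends, use_csd=True)
        if len(c) == 1:
            result.append((c[0]["from_cpu"], r["ts"] - c[0]["ts"]))
    return result
```

Without the csd, `candidates(runs[0], sends, use_csd=False)` returns both sends,
so the delta cannot be attributed to a sender; with it, each run pairs with one
send. Since a csd can be reused, a real script would presumably also need to
bound the match by time.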

Sure, I will improve this in my patch.

You think I should rebase it on top of tip/smp/core in order to add it
to the set?

Thanks!
Leo

>
> Valentin -- is any of this something you'd also find useful?
>