Re: [RFC PATCH 1/1] smp: Add tracepoints for functions called with smp_call_function*()

From: Peter Zijlstra
Date: Wed May 03 2023 - 10:59:43 EST


On Wed, Apr 19, 2023 at 12:45:08AM -0300, Leonardo Brás wrote:
> On Thu, 2023-04-06 at 11:55 +0200, Peter Zijlstra wrote:
> > On Thu, Apr 06, 2023 at 04:57:18AM -0300, Leonardo Bras wrote:
> > > When running RT workloads in isolated CPUs, many cases of deadline misses
> > > are caused by remote CPU requests such as smp_call_function*().
> > >
> > > For those cases, having the names of those functions running around the
> > > deadline miss moment could help finding a target for the next improvements.
> > >
> > > Add tracepoints for acquiring the function name & argument before entry and
> > > after exiting the called function.
> > >
> > > Signed-off-by: Leonardo Bras <leobras@xxxxxxxxxx>
> >
> > How are the patches queued there not sufficient?
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=smp/core
> >
>
> IIUC the last commits add tracepoints that are collected in the
> requesting CPU, at the moment of scheduling the IPI, which are also useful in
> some scenarios.
>
> In my scenario, it could help a little, since it makes it possible to filter
> what all other cpus are scheduling on the requested cpu. OTOH it could also be
> misleading, as the requested cpu could be running something that was scheduled
> way before.
>
> The change I propose does exactly what my scenario needs: track exactly which
> function was running at a given time on the requested CPU. With this info, we
> can check which (if any) remotely requested function was running in a given
> time window.

I was thinking you could simply (graph)-trace
__flush_smp_call_function_queue() with a max_graph_depth or so (Steve
says that ought to work).
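
A rough sketch of that setup, driven through tracefs. The file names
(set_graph_function, max_graph_depth, current_tracer) are the standard
ftrace controls; the mount point, the depth of 2, and the helper names are
assumptions for illustration, and whether the function is traceable depends
on the kernel build.

```python
import os

# Usual tracefs mount point; older setups use /sys/kernel/debug/tracing.
TRACEFS = "/sys/kernel/tracing"


def graph_trace_settings(func="__flush_smp_call_function_queue", depth=2):
    """Return (tracefs file, value) pairs in the order they should be written.

    depth=2 is an arbitrary choice: deep enough to show the callbacks invoked
    directly from the flush function, shallow enough to keep the trace small.
    """
    return [
        ("set_graph_function", func),         # only graph-trace this function
        ("max_graph_depth", str(depth)),      # limit how deep the graph goes
        ("current_tracer", "function_graph"),
    ]


def apply_settings(settings, root=TRACEFS):
    """Write each setting to its control file under root (needs root on tracefs)."""
    for name, value in settings:
        with open(os.path.join(root, name), "w") as f:
            f.write(value + "\n")


# Typical use, as root:
#   apply_settings(graph_trace_settings())
# then read the result from the "trace" file under TRACEFS.
```

The root parameter is only there so the writes can be pointed at a scratch
directory for a dry run.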

But even that might be too specific; your use case sounds more like what
we have the irq-off latency tracer for, and that thing will immediately
tell you what functions were being run.

> (An unrelated thing I just thought of: we could even use the commits you
> pointed to together with my proposed change in order to measure how long it
> takes for a requested function to run / complete on the requested cpu)

I don't think you could actually do that; the send tracepoints Valentin
added don't log the csd address, which means you cannot distinguish
two CSDs with the same function sent from different CPUs.

To do this you'd need to add the csd address to the ipi_send
tracepoints and your own (possibly replacing info -- because I don't
think that's too useful).
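
To illustrate the ambiguity, here is a small userspace sketch that pairs
"send" events with "entry" events from a made-up trace. The Event layout,
the csd field on the send side, and the timestamps are all hypothetical; no
existing tracepoint is claimed to produce this. It also assumes the target
CPU ran the two callbacks in the opposite order to the send timestamps.

```python
from collections import namedtuple

# Hypothetical, simplified trace record: timestamp, CPU that logged the
# event, event kind ("send" or "entry"), callback name, and csd address.
Event = namedtuple("Event", "ts cpu kind func csd")


def match(events, key):
    """Pair each 'entry' with the oldest unmatched 'send' sharing key(event).

    Returns (sending cpu, entry_ts - send_ts) tuples in entry order.
    """
    pending = {}
    latencies = []
    for ev in sorted(events, key=lambda e: e.ts):
        k = key(ev)
        if ev.kind == "send":
            pending.setdefault(k, []).append(ev)
        elif ev.kind == "entry" and pending.get(k):
            send = pending[k].pop(0)
            latencies.append((send.cpu, ev.ts - send.ts))
    return latencies


# Two CPUs send the same function to CPU 3 using different CSDs.
events = [
    Event(10, 0, "send", "do_flush", 0xA),
    Event(12, 1, "send", "do_flush", 0xB),
    Event(20, 3, "entry", "do_flush", 0xB),
    Event(25, 3, "entry", "do_flush", 0xA),
]

by_func = match(events, key=lambda e: e.func)  # [(0, 10), (1, 13)]
by_csd = match(events, key=lambda e: e.csd)    # [(1, 8), (0, 15)]
```

Keyed on the function alone, both latencies are attributed to the wrong
sender; keyed on the csd address, each entry finds its own send.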

Valentin -- is any of this something you'd also find useful?