Re: [LSF/MM/BPF TOPIC] tracing the source of errors

From: Dave Chinner
Date: Thu Feb 08 2024 - 21:28:00 EST

On Thu, Feb 08, 2024 at 10:09:36AM +0100, Miklos Szeredi wrote:
> On Wed, 7 Feb 2024 at 22:37, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > ftrace using the function_graph tracer will emit the return values
> > of the functions if you use it with the 'funcgraph-retval' option.
> >
> > Seems like a solved problem?
> Except
> a) this seems exceedingly difficult to set up for non-developers,
> which is often where this is needed. Even strace is pretty verbose
> and the generated output too big, let alone all function calls across
> the whole system.

trace-cmd is your friend.

# trace-cmd record -p function_graph -l vfs_statx sleep 1
# trace-cmd report

There's also 'perf ftrace ...' as another wrapper for ftrace-based
profiling, though I have never used it.
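
For reference, a minimal sketch of the raw tracefs steps that turn on
the 'funcgraph-retval' option mentioned earlier in the thread. The
helper name setup_retval_trace, its arguments, and the default mount
point /sys/kernel/tracing are illustrative assumptions, not something
from this thread. The option only exists on kernels built with
return-value support for the function_graph tracer.

```shell
#!/bin/sh
# Sketch only: enable function_graph with return values for one function.
# setup_retval_trace is a hypothetical helper name.
#   $1: tracefs mount point (assumed default: /sys/kernel/tracing)
#   $2: function to filter on (default taken from the example above)
setup_retval_trace() {
    tracefs=${1:-/sys/kernel/tracing}
    func=${2:-vfs_statx}

    # Bail out if this kernel does not expose the option.
    if [ ! -e "$tracefs/options/funcgraph-retval" ]; then
        echo "funcgraph-retval not available under $tracefs" >&2
        return 1
    fi

    echo 0 > "$tracefs/tracing_on"                 # pause while configuring
    echo "$func" > "$tracefs/set_ftrace_filter"    # limit tracing to one function
    echo function_graph > "$tracefs/current_tracer"
    echo 1 > "$tracefs/options/funcgraph-retval"   # record return values
    echo 1 > "$tracefs/tracing_on"                 # resume tracing
}
```

Run as root, reproduce the failure, then read the result from the
'trace' file under the same directory.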

> b) can only point to the function where the error was generated. But the same error
> is often generated for several different reasons within the same
> function and the return value doesn't help there.

In most cases, knowing the function call parameters and the return
value is enough to determine which error check failed. Seems like
that's within the scope of what ftrace could provide us with...
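
As a rough illustration of narrowing a trace down to the failing
calls, the filter below keeps only function_graph lines whose recorded
return value is a negative number. The helper name errors_only is
hypothetical, and the pattern is an assumption about how the return
value is printed in the trailing comment (something like
"/* func = -22 */" or "/* func ret=-22 */", depending on kernel
version), so check it against real output before relying on it.

```shell
#!/bin/sh
# Sketch only: keep trace lines that report a negative return value.
# errors_only is a hypothetical helper; it reads a trace on stdin.
# Assumes the return value appears at the end of a trailing comment,
# printed either as "= -N */" or as "ret=-N */".
errors_only() {
    grep -E -e '(= |ret=)-[0-9]+ \*/'
}
```

Typical use would be piping the report through it, for example
'trace-cmd report | errors_only'. It only works if return values were
recorded and are not being printed in hex.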

> I think a) is the critical one, and possibly the ftrace infrastructure
> could be used for something more friendly that just pointed to the
> function where the error was generated without having to go through
> hoops.


I don't use ftrace directly - never have. trace-cmd has been around
for a long time and does pretty much everything I've ever needed
over the past 15 years. That said, trace-cmd does not export all of
ftrace's features, but most of them are there and there are a lot of
users and developers already familiar with trace-cmd as a way of
seeing what is going on inside the kernel when things go wrong....

Hence I just don't see the "ftrace is difficult to use" argument as
being relevant to "how do we trace the source of the error".


Dave Chinner