Re: [PATCH 3/6] dax: add tracepoint infrastructure, PMD tracing

From: Linus Torvalds
Date: Fri Nov 25 2016 - 14:51:34 EST


On Thu, Nov 24, 2016 at 11:37 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> My impression is that nobody (at least kernel-side) wants them to be
> a stable ABI, so long as nobody in userland screams about their code
> being broken, everything is fine. As usual, if nobody notices an ABI
> change, it hasn't happened. The question is what happens when somebody
> does.

Right. There is basically _no_ "stable API" for the kernel anywhere,
it's just an issue of "you can't break workflow for normal people".

And if somebody writes his own trace scripts, and some random trace
point goes away (or changes semantics), that's easy: he can just fix
his script. Tracepoints aren't ever going to be stable in that sense.

But when somebody then writes a trace script that is so useful that
distros pick it up, and people start using it and depending on it,
then _that_ trace point may well have become effectively set in
stone.

That's happened once already with the whole powertop thing. It didn't
spread all that widely, and the fix was largely to improve powertop
to the point where it wasn't a problem any more, but we've clearly
had one case of this happening.

But I suspect that something like powertop is fairly unusual. There is
certainly room for similar things in the VFS layer (think "better
vmstat that uses tracepoints"), but I suspect the bulk of tracepoints
are going to be for random debugging (so that developers can say
"please run this script") rather than for an actual user application
kind of situation.

So I don't think we should be _too_ afraid of tracepoints either. When
being too anti-tracepoint is a bigger practical problem than the
possible problems down the line, the balance is wrong.

As long as tracepoints make sense from a higher standpoint (i.e. not
just a random implementation detail of the day), and they aren't
everywhere, they are unlikely to cause much of a problem.

We do have filesystem code that is just disgusting. As an example:
fs/afs/ tends to have these crazy "_enter()/_exit()" macros in every
single function. If you want that, use the function tracer. That seems
to be just debugging code that has been left around for others to
stumble over. I do *not* believe that we should encourage that kind of
"machine gun spray" use of tracepoints.
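
For reference, a minimal sketch of what "use the function tracer" can
look like from a shell, as an alternative to per-function enter/exit
macros. It assumes tracefs is mounted at /sys/kernel/tracing (older
setups may have it under /sys/kernel/debug/tracing), that the kernel
was built with the function tracer, and that it is run as root. The
helper name trace_afs_functions and the TRACEFS variable are made up
for illustration, and the 'afs_*' glob is only a guess at which
functions are of interest.

```shell
#!/bin/sh
# Sketch only: trace entry into every function matching afs_* with
# ftrace's function tracer, with no tracing code added to fs/afs/.
# TRACEFS is a hypothetical override; the default is the usual mount.
TRACEFS="${TRACEFS:-/sys/kernel/tracing}"

trace_afs_functions() {
    # Limit tracing to functions whose names match the glob.
    echo 'afs_*' > "$TRACEFS/set_ftrace_filter" &&
    # Select the function tracer.
    echo function > "$TRACEFS/current_tracer" &&
    # Make sure the ring buffer is recording.
    echo 1 > "$TRACEFS/tracing_on"
}
```

After calling trace_afs_functions as root, the recorded calls should
show up in "$TRACEFS/trace", and writing "nop" to current_tracer turns
the tracer back off.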

But tracing actual high-level things like IO and faults? I think that
makes perfect sense, as long as the data that is collected is also the
actual event data, and not so much a random implementation issue of
the day.

Linus