Re: [PATCH 2/4] ftrace - add function_duration tracer

From: Ingo Molnar
Date: Thu Dec 10 2009 - 13:35:43 EST



* Frank Ch. Eigler <fche@xxxxxxxxxx> wrote:

> Hi -
>
> > > FWIW, those who want to collect such measurements today can do so
> > > with a few lines of systemtap script for each of the above.
> >
> > Well, I don't think stap can do workload instrumentation. It can do
> > system-wide (and user local / task local) - but can it do per task
> > hierarchies?
>
> It can track the evolution of task hierarchies by listening to process
> forking events, and filter other kernel/user events according to
> then-current hierarchy data. One primitive implementation of this is
> in the target_set.stp tapset, but it's easy to script up other
> policies.

target_set.stp is not really adequate. Have you actually _tried_ to use
it on something real like hackbench, which runs thousands (or tens of
thousands) of tasks? You'll soon find that associative arrays are not
really adequate for that ...

Another problem I can see is that target_set.stp starts with:

global _target_set # map: target-set-pid -> ancestor-pid

See that 'global' thing? It's a system global variable - i.e. you cannot
measure two task hierarchies at once.

> > Also, I don't think stap supports proper separation of per workload
> > measurements either. I.e. can you write a script that will work
> > properly even if multiple monitoring tools are running, each trying
> > to measure latencies?
>
> Sure, always has. You can run many scripts concurrently, each with
> its own internal state. (Overheads accumulate, sadly & naturally.)

To measure latencies you need two probes, a start and a stop one. How do
you define a local variable that is visible to those two probes? You
have to create a global variable - but that will/can clash with other
instances.
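
To make the start/stop pairing concrete, here is a minimal sketch in
Python (not SystemTap, and not a claim about how stap scopes its
variables). It only illustrates the general pattern: the entry probe
records a timestamp keyed by thread id, the return probe looks it up and
computes the delta, and the table lives in a per-tool object rather than
in one shared global. The names LatencyTracer, on_entry and on_return
are hypothetical and exist only for this illustration.

```python
import time


class LatencyTracer:
    """Per-instance latency state: one object per monitoring tool."""

    def __init__(self, clock=time.monotonic_ns):
        self._clock = clock   # injectable clock, handy for testing
        self._start = {}      # tid -> timestamp taken by the entry probe
        self.samples = []     # list of (tid, latency) pairs

    def on_entry(self, tid):
        # "start" probe: remember when this thread entered the function
        self._start[tid] = self._clock()

    def on_return(self, tid):
        # "stop" probe: pair up with the matching entry, if there was one
        t0 = self._start.pop(tid, None)
        if t0 is None:
            return None       # return without a recorded entry: ignore it
        delta = self._clock() - t0
        self.samples.append((tid, delta))
        return delta
```

Two tracer objects can watch the same thread id without overwriting each
other's start timestamps, because each one owns its own table. With a
single shared table keyed only by thread id, the second tool's entry
probe would clobber the first tool's timestamp.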

( Also, you don't offer per application channels/state from the same
script. Each app has to define their own probes, duplicating the
script and increasing probe chaining overhead. )

The whole state sharing and eventing model of SystemTap is poorly
thought out.

> > Also, i personally find built-in kernel functionality more trustable
> > than dynamically built stap kernel modules that get inserted.
>
> I understand. In the absence of a suitable bytecode engine in the
> kernel, this was the only practical way to do everything we needed.

You seem to be under the mistaken assumption that your course of action
with SystemTap is somehow limited by what is available (or not) in the
upstream kernel.

In reality you can implement anything you want (in fact you did
precisely that - _against_ repeated advice of upstream kernel
developers), and if it's good, it will be merged - simple as that. It
might take years, but once you deliver the proof (which comes in form of
lots of happy users/developers), it happens.

So saying 'but the kernel does not have a bytecode interpreter' (or any
other excuse) is pretty lame.

Ingo