Re: Fix powerTOP regression with 2.6.39-rc5

From: Ingo Molnar
Date: Tue May 10 2011 - 04:42:35 EST



* Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

> > Check whether there's any feature missing from it that you'd like to see, add
> > it. Rinse, repeat.
>
> Again, the design of trace/perf is task oriented. Ftrace is system
> oriented. Could we agree on that?

Like i said in the previous mail, i don't know where you got this nonsensical
idea from. ftrace is indeed system oriented, and that is hardcoded into its
design - i.e. it's a design mistake.

perf is fundamentally *event* oriented - and various levels of grouping and
buffering can be applied to events.

'system wide', 'per cpu', 'per workload', 'per task' or 'per cgroup' are just
some of the many natural groupings of events that users/developers would like
to see - and we offer these.

- that is why sysprof is using perf events to collect system-wide events.

- that is why PowerTOP uses perf events in system-wide event collection mode.

- that is why 'perf top' uses system wide profiling by default (but can do per
CPU or per task profiling as well)

- that is why 'perf record' defaults to a per workload (not a per task as you
claim) mode of event collection

- that is why 'perf stat' defaults to per workload events
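
As a rough illustration of these groupings (a sketch only, not taken from the
perf sources: the helper name perf_scope_cmd, the 'cycles' event and the
example arguments are assumptions made for this example), the same event can be
requested at each level just by changing the 'perf stat' options. The helper
only prints the command line, it does not run perf, and it assumes a perf build
that supports the -G cgroup option:

```sh
#!/bin/sh
# Hypothetical helper (not part of perf): print, without running it, a
# 'perf stat' command line that counts 'cycles' at the requested grouping.
# Usage: perf_scope_cmd SCOPE [ARG]
perf_scope_cmd() {
    case "$1" in
        # all CPUs, all tasks, for one second
        system)   echo "perf stat -e cycles -a sleep 1" ;;
        # only the CPU list given in ARG (e.g. "0" or "0-3")
        cpu)      echo "perf stat -e cycles -a -C $2 sleep 1" ;;
        # an already running task whose PID is ARG
        task)     echo "perf stat -e cycles -p $2 sleep 1" ;;
        # tasks in the cgroup named ARG (needs system-wide mode)
        cgroup)   echo "perf stat -e cycles -a -G $2 sleep 1" ;;
        # the command ARG plus whatever it forks: the default mode
        workload) echo "perf stat -e cycles -- $2" ;;
        *)        echo "unknown scope: $1" >&2; return 1 ;;
    esac
}

perf_scope_cmd system
perf_scope_cmd cpu 0
perf_scope_cmd task 1234
perf_scope_cmd cgroup mygroup
perf_scope_cmd workload "make -j4"
```

The event stays the same in every line, only the grouping options around it
change.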

Do you see that it is ftrace that remained behind the times, by stubbornly
forcing some nonsensical global view and encoding it not only in its design but
in its APIs as well?

I really meant it when i told you that perf events were the natural next step
after ftrace, in the evolution of Linux tracing/instrumentation.

> > > Now that perf has entered the tracing field, I would be happy to bring
> > > the two together. [...]
> >
> > Great - please see tip:tmp.perf/trace, that would be a very good point to
> > start. It's a working prototype for an ftrace-alike tracing workflow.
>
> I'll do it, if we can agree about the ftrace as system tracing/debugging, and
> trace can focus on user specific tracing.

Ok, you've finally admitted that you do not really want 'unification' between
ftrace and perf - which was my suspicion all along. I really prefer 100% honest
discussions with people from whom i pull and it took quite some time for you to
admit to this position ...

Despite what you say, perf and 'trace' can do system-wide tracing just fine:

  $ trace record -a
  ^C
  # trace recorded [205.108 MB] - try 'trace summary' to get an overview

( and note that the code in tip:tmp.perf/trace2 is a very early prototype,
barely tested - it just demonstrates the idea. )

In fact we could make 'trace' do system-wide tracing by default, and have it
fall back to workload level tracing only if it does not have the privileges to
trace the whole system.

Why not use the correctly designed tracing approach and enhance it, and merge
all the remaining useful bits of ftrace into it?

Thanks,

Ingo