Re: perf, x86: Provide a PEBS capable cycle event

From: Ingo Molnar
Date: Wed Jan 26 2011 - 08:58:23 EST



* Stephane Eranian <eranian@xxxxxxxxxx> wrote:

> On Wed, Jan 26, 2011 at 1:06 PM, Ingo Molnar <mingo@xxxxxxx> wrote:
> >
> > * Stephane Eranian <eranian@xxxxxxxxxx> wrote:
> >
> >> On Wed, Jan 26, 2011 at 12:37 PM, Ingo Molnar <mingo@xxxxxxx> wrote:
> >> >
> >> > * Linux Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx> wrote:
> >> >
> >> >> Gitweb:     http://git.kernel.org/linus/7639dae0ca11038286bbbcda05f2bef601c1eb8d
> >> >> Commit:     7639dae0ca11038286bbbcda05f2bef601c1eb8d
> >> >> Parent:     abe43400579d5de0078c2d3a760e6598e183f871
> >> >> Author:     Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> >> >> AuthorDate: Tue Dec 14 21:26:40 2010 +0100
> >> >> Committer:  Ingo Molnar <mingo@xxxxxxx>
> >> >> CommitDate: Thu Dec 16 11:36:44 2010 +0100
> >> >>
> >> >>     perf, x86: Provide a PEBS capable cycle event
> >> >>
> >> >>     Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> >> >>     LKML-Reference: <new-submission>
> >> >>     Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
> >> >> ---
> >> >>  arch/x86/kernel/cpu/perf_event_intel.c |   26 ++++++++++++++++++++++++++
> >> >>  1 files changed, 26 insertions(+), 0 deletions(-)
> >> >
> >> > btw., precise profiling via PEBS:
> >> >
> >> >  perf record -e cycles:p ...
> >> >
> >> > works pretty nicely now on Nehalem CPUs and later.
> >> >
> >> The problem is that cycles:p is not equivalent to cycles in terms of how
> >> cycles are counted. cycles counts only unhalted cycles. cycles:p counts
> >> ALL cycles, even when the CPU is in the halted state.
> >
> > > That's not really an issue in practice: at most it can cause a slightly larger value for:
> >
> >   2.38%    swapper Â[kernel.kallsyms]   Â[k] mwait_idle_with_hints               â
> >
> > > That entry exists with the regular cycles event _anyway_, because every irq entry
> > > ends up there.
> >
>
> There is a difference in interpretation. Because now when you get samples in those
> idle routines, you cannot tell whether it is because you actually execute code
> there or because you were halted (not executing) and now sampling has altered the
> behavior of the system in that you wake up from halted state to service a PMU
> interrupt.

The thing is, most people are not interested in seeing the idle routine entry
anyway, so we already exclude it in, say, 'perf top' output; see the skip_symbols[]
array in builtin-top.c.
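
To illustrate the kind of filtering meant here, below is a minimal sketch of a
skip list check. It is not the code from builtin-top.c: the names
skip_symbols_sketch[] and symbol_is_skipped() are hypothetical, and the list
entries are only illustrative (the real array may contain different symbols).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical skip list, loosely modelled on the idea of skip_symbols[].
 * The entries are assumptions for illustration; the actual array in
 * builtin-top.c may differ.
 */
static const char *const skip_symbols_sketch[] = {
	"default_idle",
	"mwait_idle_with_hints",
	NULL,	/* sentinel terminating the list */
};

/* Return true if samples attributed to 'name' should be hidden from output. */
static bool symbol_is_skipped(const char *name)
{
	for (size_t i = 0; skip_symbols_sketch[i] != NULL; i++) {
		if (strcmp(name, skip_symbols_sketch[i]) == 0)
			return true;
	}
	return false;
}
```

With a filter like this, samples landing in the idle routines never reach the
displayed profile, whether they come from halted or unhalted cycles.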

So the utility of telling those two cases apart seems rather low.

If we contrast it to the utility of having precise PEBS sampling, which dramatically
improves *all* profiling data and which improves the reading of annotated profiling
output beyond measure, the default path to go here seems rather obvious. Agreed?

Thanks,

Ingo