Re: [GIT PULL] perf fixes

From: Steven Rostedt
Date: Fri Jun 22 2012 - 14:50:39 EST


On Fri, 2012-06-22 at 11:07 -0700, Linus Torvalds wrote:
> On Fri, Jun 22, 2012 at 6:36 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> >
> > Steven Rostedt (1):
> > ftrace: Make all inline tags also include notrace
>
> Btw, this is something I've been wondering about: function call
> tracing and inlining seems to be fundamentally incompatible.

True, the -pg option never adds the mcount call to any function that
gets inlined. But just an FYI, the alternative -finstrument-functions,
which traces both the start and the end of the function, even
instruments inlined functions. Which is one of the reasons I totally
avoided it.
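
A minimal user-space sketch of the -finstrument-functions behavior,
assuming gcc. The "notrace" macro here is a local stand-in for the
kernel macro of the same name, and the helper and counter names are
hypothetical. The hooks only fire when the file is built with
-finstrument-functions; without that flag the code still compiles and
runs, it just never calls them.

```c
#include <stdio.h>

/* Assumption: a user-space stand-in for the kernel's notrace macro. */
#define notrace __attribute__((no_instrument_function))

/* Number of instrumented function entries seen so far. */
static int enter_count;

/*
 * Hooks that gcc calls on entry/exit of every instrumented function
 * when building with -finstrument-functions. They must be excluded
 * from instrumentation themselves, or they would recurse.
 */
notrace void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    enter_count++;
}

notrace void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
}

/* Instrumented even if gcc inlines it into compute(). */
static inline int traced_helper(int x)
{
    return x + 1;
}

/* Marked notrace, so no hook calls are generated for it. */
static inline notrace int untraced_helper(int x)
{
    return x + 1;
}

/* Externally visible caller; instrumented under -finstrument-functions. */
int compute(int x)
{
    return traced_helper(x) + untraced_helper(x);
}

/* Accessor for the counter, kept out of the count itself. */
notrace int get_enter_count(void)
{
    return enter_count;
}
```

Building this once with and once without -finstrument-functions and
comparing get_enter_count() after a call to compute() is one way to see
which functions the compiler instrumented.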

>
> And gcc can (and does, depending on version and details) inline pretty
> much any static function, whether we mark it inline or not.

Right, which means that those do not get traced either.

>
> Now, there's no question that we don't want inlined functions to be
> traced, but that actually means that the *logical* thing would be to
> try to somehow tell gcc to not ever do the whole stupid mcount thing
> for functions that *might* be inlined - and at least be consistent
> about it.

Hmm, I'm not sure how to tell gcc that :-/

>
> IOW, is there some way to get the mcount thing to only happen for
> functions that either have their address taken, or have external
> visibility?
>
> Because that mcount thing is expensive as hell, if people haven't
> noticed (and I'm not talking about just the call instruction that I
> think we can stub out

It is stubbed out. Has been since day one, when DYNAMIC_FTRACE is
supported and enabled.

> - it changes code generation in other ways too).

One thing it does, which I hate, is that it enables (forces) frame
pointers.

> And it looks like distros enable it by default, which annoys my
> performance-optimizing soul deeply.

We have been working on an alternative. That is the -mfentry, and I have
working code that is still in the testing phase (and looking good!).

When you add -mfentry with -pg, instead of calling mcount, which comes
after the frame pointer has been set up, it calls fentry as the very
first instruction in the function. It should not interfere with any
other code generation.
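
A small sketch for comparing the two modes, assuming an x86 gcc at
4.6.0 or later. The file name and the function are hypothetical; the
symbol gcc emits for the fentry call is spelled __fentry__. The build
commands in the comment use -S so that only the generated assembly is
produced and nothing needs to be linked.

```c
/*
 * fentry_demo.c (hypothetical file name)
 *
 * Generate assembly in both modes and compare the start of sum_to():
 *
 *   gcc -O2 -pg -S -o mcount.s fentry_demo.c
 *   gcc -O2 -pg -mfentry -S -o fentry.s fentry_demo.c
 *
 * With plain -pg the profiling call targets mcount and is placed after
 * the frame pointer setup. With -pg -mfentry the call targets
 * __fentry__ and is the first instruction of the function.
 */

/* Sum of the integers 1..n; returns 0 for n < 1. */
int sum_to(int n)
{
    int total = 0;

    for (int i = 1; i <= n; i++)
        total += i;

    return total;
}
```

Searching the two .s files for "mcount" and "__fentry__" shows where
each call lands relative to the function prologue.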

The -mfentry option has been supported since gcc 4.6.0, and only for x86
(thanks to Andi Kleen for doing this). Here's my first email that
described it a little (I've been working on various versions, but it has
settled down recently):

https://lkml.org/lkml/2011/2/9/271

Back then, one of the issues I worried about was its interaction with
kprobes. Since kprobes are commonly inserted at the first instruction of
a function, and kprobes could not be inserted at an ftrace nop, this
would cause issues for users wanting to insert their probe at the
beginning of the function.

But Masami has already helped me in fixing this up. Both the kprobes
fix and the fentry without frame pointers code are working well. I was
going to push it for 3.7 so that I can validate it a bit more.

Here's my last RFC patch set for the kprobes/ftrace work, where kprobes
can actually be optimized by using the ftrace infrastructure.

https://lkml.org/lkml/2012/6/12/668

-- Steve


>
> So doing it a bit less would be lovely.
>
> Linus

