Re: [patch 2/3] RCU move trace defines to rcupdate_types.h

From: Jeremy Fitzhardinge
Date: Fri Apr 17 2009 - 13:10:00 EST


Steven Rostedt wrote:
I was talking with Arjan about this in San Francisco. The expense of doing function calls. He told me (and he can correct me if I'm wrong here) that function calls are like branch predictions. The branch part is the fact that a retq is a jmp that can go to different locations. There's logic in the CPU to match calls with retqs to speed this up.

Right. The call is to a fixed address, so there's no prediction needed at all; the CPU can immediately start fetching instructions at the call target without missing a beat. When it hits the ret in the function, assuming nobody has been playing games with the stack pointer or modifying the return address on the stack, it can just look up the return address from its cache and start fetching from there, again with no bubbles. It should be very close to a pair of jumps, aside from one extra memory write (for the return address on the stack) - and that shouldn't be too bad, because the chances are the cache is hot for the stack.
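
To make the call/ret matching concrete, here is a toy model in C of a return-address cache. It is only a sketch under stated assumptions, not a description of any particular CPU: the names (struct ras, ras_call, ras_ret) and the depth of 16 are invented for illustration, and real hardware differs in size and in how it handles overflow.

```c
#include <stddef.h>
#include <stdint.h>

#define RAS_DEPTH 16	/* assumed depth; real hardware sizes vary */

/* Hypothetical model of the CPU's cache of pending return addresses. */
struct ras {
	uintptr_t slot[RAS_DEPTH];
	size_t top;	/* outstanding calls; may exceed RAS_DEPTH */
};

/* On a call: record the address of the instruction after the call. */
static void ras_call(struct ras *r, uintptr_t return_addr)
{
	/* Past RAS_DEPTH nested calls, the oldest entries get overwritten. */
	r->slot[r->top % RAS_DEPTH] = return_addr;
	r->top++;
}

/*
 * On a ret: pop the predicted target and compare it with the address
 * actually read from the stack.  Returns 1 if the prediction was right
 * (no bubble), 0 if it was wrong or there was nothing to predict from.
 */
static int ras_ret(struct ras *r, uintptr_t actual_addr)
{
	if (r->top == 0)
		return 0;
	r->top--;
	return r->slot[r->top % RAS_DEPTH] == actual_addr;
}
```

In this model a ret that pairs up with its call always predicts correctly, while a ret whose on-stack return address was modified, or one with no matching call, does not.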

He also told me that the "mcount" retq that I do is actually more expensive. The logic does not expect a function to return immediately. (for stubs, I'm not sure that was a good design).

Hence,

	call mcount

	[...]

mcount:
	retq


is expensive, compared to a call to a function that actually does something.

Again, Arjan can correct me here, since I'm just trying to paraphrase what he told me.

Sounds reasonable; it takes a little while for the CPU to work out what the return address will be, even though it's cached, so doing an immediate ret will cause a bubble while it sorts itself out. But that shouldn't be an issue for the calls I'm talking about.

J