Re: [PATCH] x86: mark some mpspec inline functions as __init

From: Borislav Petkov
Date: Thu Feb 25 2021 - 16:33:53 EST


Hi Nick,

On Thu, Feb 25, 2021 at 12:31:33PM -0800, Nick Desaulniers wrote:
> So LLVM is telling us bar() was inlined into foo(); (baz() can't be
> because it wasn't defined in this TU). You can use this to "watch"
> the compiler make decisions about inlining.

thanks for taking the time to write all this - it is very interesting
and reminds me that I simply won't have time in this life of mine to
learn about compiler inlining - that's a whole other universe. :-)

I hope you can use that text in a blog post too - it is an interesting
read.

> (full thread: https://lore.kernel.org/lkml/20210225112247.2240389-1-arnd@xxxxxxxxxx/)
> I suspect in this specific case, "Interprocedural Sparse Conditional
> Constant Propagation" sees the calls to the same fn with different
> constants, propagates those down creating two specialized versions of
> the callee (so they are distinct functions now), inlines those into
> get_smp_config()/early_get_smp_config(), then there's too many callers
> of those in a single TU where inlining would cause excessive code
> bloat.

Well, there's exactly one caller of get_smp_config() - that's setup_arch().
early_get_smp_config() is also called exactly once, in amd_numa_init().

Now, with my simplistic approach, I can replace the lines at those call
sites by hand with the

x86_init.mpparse.get_smp_config(<arg>);

call. So those become exactly one function call. I still don't see how
that can be done any differently, frankly.

But apparently the cost model has decided that this is not inlineable.
Maybe because that function ptr is assigned at boot time and that
somehow gets the cost model to give it a very high (or low) value. Or
maybe because the wrappers are calling through a variable - the x86_init
thing - which is in a different section and that confuses the inliner.
Or whatever - totally speculating here.
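
For illustration, the shape of the thing is roughly the following. This
is a simplified userspace mock-up and not the actual kernel definitions:
the struct layout, default_get_smp_config() and the two counters are
made up so that it compiles on its own. The only point is that each
wrapper is a single indirect call through a pointer in a data structure,
with a constant argument.

```c
#include <assert.h>

/* Mock-up of an ops table; the real x86_init has many more members. */
struct mpparse_ops {
	void (*get_smp_config)(unsigned int early);
};

struct init_ops {
	struct mpparse_ops mpparse;
};

/* Hypothetical bookkeeping so that the calls are observable. */
static unsigned int last_early = ~0u;
static unsigned int nr_calls;

/* Hypothetical default implementation, stands in for the real parser. */
static void default_get_smp_config(unsigned int early)
{
	last_early = early;
	nr_calls++;
}

/* Userspace stand-in for the kernel's x86_init variable. */
static struct init_ops x86_init = {
	.mpparse = { .get_smp_config = default_get_smp_config },
};

/* The two wrappers: one indirect call each, differing only in the constant. */
static inline void get_smp_config(void)
{
	x86_init.mpparse.get_smp_config(0);
}

static inline void early_get_smp_config(void)
{
	x86_init.mpparse.get_smp_config(1);
}

/* Call both wrappers once, like the two call sites do. */
void check_wrappers(void)
{
	early_get_smp_config();
	assert(last_early == 1);

	get_smp_config();
	assert(last_early == 0);
}
```

Whether a given compiler inlines wrappers like these is up to its cost
model; nothing in the source forces it either way.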

And this brings me to my point - you can't expect people to do all that
crazy dance of compiler introspection and understand cost models and
compiler optimization just to fix stuff like that.

Now, imagine we "fix" this to clang-13's inliner's satisfaction. Now
imagine too that gcc Version Next changes their inliner and that inliner
says that that "fix" is wrong, for whatever reason, bottom up, top down,
whatever. Do you feel the annoyance all around?

And since, as you say, there are no silver bullets here, I think for
cases like that we'll need an "I know what I'm doing Mr. Compiler, TYVM,
even if your cost model says otherwise" facility. And in this case I
still think __always_inline is correct.
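
A minimal sketch of what such a facility looks like at the C level,
assuming gcc or clang. In the kernel, __always_inline boils down to
"inline" plus the always_inline function attribute; the hook and the
function names below are hypothetical and only there to make the
snippet self-contained.

```c
#include <assert.h>

/* libc headers may already provide __always_inline; define it only if missing. */
#ifndef __always_inline
#define __always_inline inline __attribute__((__always_inline__))
#endif

/* Hypothetical hook, assigned at run time like a boot-time function pointer. */
static int (*config_hook)(unsigned int early);

/* Hypothetical implementation behind the hook. */
static int pick_value(unsigned int early)
{
	return early ? 100 : 200;
}

/*
 * The attribute asks the compiler to inline this wrapper into its
 * callers regardless of what its cost model would otherwise decide.
 */
static __always_inline int read_config(unsigned int early)
{
	return config_hook(early);
}

/* Two call sites with different constants, each becomes one indirect call. */
void check_read_config(void)
{
	config_hook = pick_value;

	assert(read_config(1) == 100);
	assert(read_config(0) == 200);
}
```

Both gcc and clang accept the attribute, so the annotation does not
depend on either compiler's inlining heuristics of the day.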

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette