Re: [PATCH] x86: mark some mpspec inline functions as __init

From: Nick Desaulniers
Date: Thu Feb 25 2021 - 16:59:51 EST


On Thu, Feb 25, 2021 at 1:33 PM Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> On Thu, Feb 25, 2021 at 12:31:33PM -0800, Nick Desaulniers wrote:
> > (full thread: https://lore.kernel.org/lkml/20210225112247.2240389-1-arnd@xxxxxxxxxx/)
> > I suspect in this specific case, "Interprocedural Sparse Conditional
> > Constant Propagation" sees the calls to the same fn with different
> > constants, propagates those down creating two specialized versions of
> > the callee (so they are distinct functions now), inlines those into
> > get_smp_config()/early_get_smp_config(), then there are too many callers
> > of those in a single TU where inlining would cause excessive code
> > bloat.
>
> Well, there's exactly one caller of get_smp_config - that's setup_arch().
> early_get_smp_config() gets called also exactly once in amd_numa_init().
>
> Now, with my simplistic approach, I can replace the lines at those call
> sites by hand with the
>
> x86_init.mpparse.get_smp_config(<arg>);
>
> call. So those become exactly one function call. I still don't see how
> that can be done any differently, frankly.
>
> But apparently the cost model has decided that this is not inlineable.
> Maybe because that function ptr is assigned at boot time and that
> somehow gets the cost model to give it a very high (or low) value. Or
> maybe because the wrappers are calling through a variable - the x86_init
> thing - which is in a different section and that confuses the inliner.
> Or whatever - totally speculating here.

The config that reproduces it wasn't shared here; I wouldn't be
surprised if this was found via a randconfig that enabled some option
that led to excessive code bloat somewhere.

>
> And this brings me to my point - you can't expect people to do all that
> crazy dance of compiler introspection and understand cost models and
> compiler optimization just to fix stuff like that.

Oh, I don't expect everyone to; I'm just leaving breadcrumbs to show
other people on the thread how to fish. ;)

>
> Now, imagine we "fix" this to clang-13's inliner's satisfaction. Now
> imagine too that gcc Version Next changes their inliner and that inliner
> says that that "fix" is wrong, for whatever reason, bottom up, top down,
> whatever. Do you feel the annoyance all around?

Yes, mutually unsatisfiable cases are painful, but I don't think
that's what's going on here.

>
> And since, as you say, there are no silver bullets here, I think for
> cases like that we'll need a "I know what I'm doing Mr. Compiler, TYVM,
> even if your cost model says otherwise" facility. And in this case I
> still think __always_inline is correct.

Sure, it doesn't really matter to me which way this is fixed. I
personally prefer placing functions in the correct sections and
letting the compiler be flexible: if all of this is only to satisfy
some randconfig, then __always_inline makes the inlining decision for
every config. But perhaps it doesn't matter.
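
For anyone following along, here is a minimal userspace sketch of the
two options being discussed. It is not the actual patch. DEMO_INIT and
DEMO_ALWAYS_INLINE are hypothetical stand-ins that only mimic the
kernel's __init and __always_inline, demo_mpparse is a hypothetical
stand-in for x86_init.mpparse, and the section name assumes an ELF
target built with GCC or Clang.

```c
/*
 * Userspace sketch only; every demo_/DEMO_ name here is hypothetical.
 * DEMO_INIT places a function in a named section (as __init does with
 * .init.text); DEMO_ALWAYS_INLINE overrides the inliner's cost model.
 */
#define DEMO_INIT          __attribute__((__section__("demo_init_text")))
#define DEMO_ALWAYS_INLINE inline __attribute__((__always_inline__))

/* Stand-in for the function-pointer table the wrappers call through. */
struct demo_mpparse_ops {
	void (*get_smp_config)(unsigned int early);
};

/* Records the last argument so a caller can observe the call. */
static unsigned int demo_last_early = 99;

static void DEMO_INIT demo_default_get_smp_config(unsigned int early)
{
	demo_last_early = early;
}

static struct demo_mpparse_ops demo_mpparse = {
	.get_smp_config = demo_default_get_smp_config,
};

/*
 * Option 1: put the wrapper in the init section. The compiler stays
 * free to inline it or not; if it emits an out-of-line copy, that copy
 * lands in the same section as the code it references.
 */
static inline void DEMO_INIT demo_get_smp_config(void)
{
	demo_mpparse.get_smp_config(0);
}

/*
 * Option 2: force inlining, so no out-of-line copy of the wrapper is
 * emitted regardless of what the cost model decides in a given config.
 */
static DEMO_ALWAYS_INLINE void demo_early_get_smp_config(void)
{
	demo_mpparse.get_smp_config(1);
}
```

Calling demo_get_smp_config() and then demo_early_get_smp_config()
leaves demo_last_early at 0 and then 1, so both variants behave the
same at run time; they differ only in how much of the inlining decision
is left to the compiler.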
--
Thanks,
~Nick Desaulniers