Re: [RFC PATCH 0/4] x86/percpu: Use segment qualifiers

From: Uros Bizjak
Date: Tue Oct 03 2023 - 05:49:16 EST


On Mon, Oct 2, 2023 at 3:22 PM Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
>
> * Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
>
> > > > Clang isn't much better, but at least it doesn't generate bad code. It
> > > > just crashes with an internal compiler error on the above trivial
> > > > test-case:
> > > >
> > > > fatal error: error in backend: cannot lower memory intrinsic in
> > > > address space 257
> > > >
> > > > which at least tells the user that they can't copy memory from that
> > > > address space. But once again shows that no, this feature is not ready
> > > > for prime-time.
> > > >
> > > > If literally the *first* thing I thought to test was this broken, what
> > > > else is broken in this model?
> > > >
> > > > And no, the kernel doesn't currently do the above kinds of things.
> > > > That's not the point. The point was "how well is this compiler support
> > > > tested". The answer is "not at all".
> >
> > I don't agree with the above claims. The generated code was the product
> > of a too limited selection of available copy algorithms in unusual
> > circumstances, but even in the case of generic fallback code, the
> > generated code was *correct*. As said in the previous post, and
> > re-confirmed by the patch in the PR, the same code in GCC handles
> > implicit (__thread) and named address spaces. At the end of the day, the
> > problematic code was merely a missing-optimization (the bug with the
> > lowest severity in GCC).
>
> Yeah, so the feature generated really crappy code for Linus's
> testcase. That's never a good sign for compiler features, full stop.
>
> Do we want the kernel to be at the bleeding edge of an
> unusual compiler feature that is only used internally by the
> compiler in a very specific fashion?

I understand your reservations about the new feature, but please allow
me to bring up some facts about it. The feature is *far* from being an
unusual internal compiler feature in GCC. It was introduced for gcc-6
(mostly for embedded targets), but in 2017 the complete thread local
storage (e.g. __thread) handling was switched to named address space
and released in gcc-8. I don't remember any reported problem with this
feature (and I know what I'm talking about, the switch to named AS in
the compiler was done by myself [1]). Since named AS uses the same
compiler code, I'm quite confident that it shouldn't be considered a
bleeding edge unusual compiler feature.

The "crappy code" happened due to the limitation in the completely
different part of the compiler. It was stringop algorithm selection, a
feature orthogonal to address spaces (tangential only in the sense
that several algorithms are not available with non-default address
space). Colorful language notwithstanding, the generated code is
nothing more than "missed-optimization" and the gcc bugreport was
qualified as that. With -mstringop-strategy=loop, the testcase
produces expected code, and the stringop selection in the compiler
will be adjusted accordingly also for the very limited selection of
stringop algorithms for named AS. Even when the kernel will never use
these algorithms for its percpu functionality...

> Maybe, but Linus's reluctance and caution is justified IMHO,
> and at minimum this feature needs some careful evaluation of
> long-time suitability [*] ...

I do have a proposal on how to introduce a new feature while
minimising risk as much as possible. I must admit that detecting the
feature for all released compilers that can handle __seg_gs seems
quite risky (I have to curb my enthusiasm somehow ;) ), so perhaps a
staged approach with a currently unreleased compiler is more
appropriate. Using:

+config CC_HAS_NAMED_AS
+ def_bool CC_IS_GCC && GCC_VERSION >= 140000

would enable the new feature only for currently unreleased
experimental GCC. This would allow to qualify the compiler and weed
out any possible problems, before the compiler is released. Please
note, that the patch is written in such a way, that the code reverts
to the existing approach for undefined CC_HAS_NAMED_AS.

I would also like to point out additional possible improvements that
are not in the proposed patch. Using named AS enables the kernel to
move one step further to PIE, as noted in the original patch
submission [2].

> [*] euphemism for: "I have no idea how to evaluate this risk"... :-/

[1] https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=cfc72af0fbdf
[2] https://lore.kernel.org/lkml/20190823224424.15296-1-namit@xxxxxxxxxx/

Thanks,
Uros.