Re: [PATCH] lib/clz_ctz.c: Fix __clzdi2() and __ctzdi2() for 32-bit kernels

From: Bill Wendling
Date: Fri Aug 25 2023 - 20:10:16 EST


On Fri, Aug 25, 2023 at 4:35 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Fri, 25 Aug 2023 at 15:57, Bill Wendling <morbo@xxxxxxxxxx> wrote:
> > >
> > Another idea is that there are __builtin_* functions for a lot of
> > functions that are currently in inline asm
>
> No. We've been through this before. The builtins are almost entirely
> untested, and often undocumented and buggy.
>
> > The major issue with the
> > `__builtin_ia32_readeflags_*` was its inability to take unrelated MSRs
> > into account during code motion. That may not be the same worry here?
>
> No, the problem with __builtin_ia32_readeflags_*() was that it was
> literally completely buggy and generated entirely broken code:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104971
>
> but that's really more of a symptom than anything else.
>
> It's a symptom of the fact that unlike inline asm's, those builtins
> are often undocumented in what compiler version they appeared, and are
> of very questionable quality. They often don't have many users, and
> the test suites are non-existent.
>
> For example, we *do* use __builtin_ffs() on x86 for constant values,
> because the compiler does the right thing.
>
> But for non-constant ones, the inline asm actually generates better
> code: gcc generatea some disgusting mess with a 'bsf' followed by a
> 'cmov' for the zero case, when we know better.
>
> See for example
>
> https://godbolt.org/z/jKKf48Wsf
>
Ew...gross.

> I don't understand why compiler people prefer a builtin that is an
> untested special case that assumes that the compiler knows what is
> going on (and often doesn't), over a generic escape facility that is
> supported and needed anyway (inline asm).
>
> In other words: the statement "builtins generate better code" is
> simply PROVABLY NOT TRUE.
>
> Builtins have often generated *worse* code than using inline asms, to
> the point where "worse" is actively buggy crap.
>
> At least inline asms are reliable. That's a *big* deal.
>
There are a couple of reasons why compiler writers (at least I) prefer
builtins to inline asm. Inline asm takes control away from the
compiler, which makes it harder for the compiler to perform normal
optimizations. It's more portable. As for better code, it won't
generate it in all situations, as you pointed out, but can typically
generate as good code.

Inline asm has its own issues---portability, difficult to use
constraints (the EFLAGS asms went back and forth on its constraints
over the years), and, from what I remember, GNU's inline asm is
closely tied to its register allocator, making it hard to support in
other compilers. Plus, there's not guarantee that an inline asm won't
be moved. (It's been discussed before, and I also believe that the
current Linux inline asm "barriers" should prevent this. I just vastly
prefer guaranteed behavior over "it works now". I've migrated many
systems to new compilers (and even compiler versions) and have seen
"well, it worked for the previous compiler" types of bugs that were
actual bugs in the programmer's code.)

Okay, none of what I said is going to convince you to use builtins,
and I'm not suggesting we do. It's just a few of the reasons why we
tend to prefer them to other methods. Just take this as an FYI. :-)

-bw