[PATCH v4 0/5] bitops: optimize code and add tests

From: Vincent Mailhol
Date: Sun Jan 28 2024 - 00:06:14 EST


This series adds a compile test to make sure that all the bitops
operations (namely __ffs(), ffs(), ffz(), __fls(), fls(), fls64())
correctly fold constant expressions given that their argument is also
a constant expression. The other functions from bitops.h are out of
scope.

So far, only the m68k and the hexagon architectures lack such
optimization. To this end, the first two patches optimize the m68k
architecture, and the third and fourth optimize the hexagon
architecture bitops functions.
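
For readers unfamiliar with the pattern, a minimal userspace sketch of
this kind of optimization could look as follows. The names my_ffs() and
my_arch_ffs_slow() are hypothetical and do not come from the patches;
the loop merely stands in for an architecture's hand-written assembly.
It assumes GCC or Clang, which provide __builtin_constant_p() and
__builtin_ctzl().

```c
#include <assert.h>

/*
 * Hypothetical stand-in for an architecture-specific helper (in a real
 * kernel this would typically be inline assembly, which the compiler
 * cannot fold). Like the kernel's __ffs(), the result is undefined for
 * word == 0: here the loop would never terminate.
 */
static inline unsigned long my_arch_ffs_slow(unsigned long word)
{
	unsigned long bit = 0;

	while (!(word & 1UL)) {
		word >>= 1;
		bit++;
	}
	return bit;
}

/*
 * Forced inlining lets the compiler see the caller's argument. When
 * that argument is known at compile time, the builtin is used so the
 * whole call can be folded into a constant; otherwise the
 * architecture-specific path is taken.
 */
static inline __attribute__((__always_inline__))
unsigned long my_ffs(unsigned long word)
{
	if (__builtin_constant_p(word))
		return __builtin_ctzl(word);
	return my_arch_ffs_slow(word);
}
```

Both paths return the same value; only the generated code differs.
Without optimization, __builtin_constant_p() may evaluate to 0 and the
slow path is then used even for literal arguments.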

The fifth and final patch adds the compile time tests to assert that
the constant folding occurs and that the result is accurate.
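
The two properties being asserted can be illustrated with a small
userspace sketch. This is not the code from patch 5/5: the struct
fold_report type and the my_ffs() and check_ffs_fold() helpers are
hypothetical, and the sketch only reports the two conditions as values,
whereas a compile-time test would turn a failure into a build error.
It assumes GCC or Clang.

```c
#include <assert.h>

/* Hypothetical bit-find helper, defined for word != 0 only. */
static inline __attribute__((__always_inline__))
unsigned long my_ffs(unsigned long word)
{
	return __builtin_ctzl(word);
}

/* Hypothetical report of the two properties under test. */
struct fold_report {
	int is_constant;	/* did the compiler see a constant result? */
	int is_accurate;	/* is the result the expected bit index? */
};

/*
 * n must be smaller than the width of unsigned long. The function is
 * force-inlined so that a literal n at the call site stays visible to
 * __builtin_constant_p(). Whether is_constant ends up as 1 depends on
 * the compiler and the optimization level.
 */
static inline __attribute__((__always_inline__))
struct fold_report check_ffs_fold(unsigned int n)
{
	struct fold_report r;

	r.is_constant = __builtin_constant_p(my_ffs(1UL << n));
	r.is_accurate = my_ffs(1UL << n) == n;
	return r;
}
```

Calling check_ffs_fold() with a literal such as 0, 10 or 31 checks that
the lowest set bit of 1UL << n is reported as n, and shows whether the
compiler in use treated the result as a constant.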

This is tested on arm, arm64, hexagon, m68k, x86 and x86_64. For other
architectures, I am putting my trust in the kernel test robot to
send a report if any of these still lacks bitops
optimizations. The kernel test robot did not complain on v3, giving me
confidence that all architectures are now properly optimized.
---

** Changelog **

v3 -> v4:

- Only apply the __always_inline to the bit-find functions, do not
touch other functions from bitops.h. I discovered that the
benchmark done in v3 was incorrect (refer to the thread for
details). The scope was thus narrowed down to the bit-find
functions for which I could demonstrate the gain in the benchmark.

- Add benchmark for hexagon (patch 3/5 and 4/5). Contrary to the
m68k benchmark, which uses an allyesconfig, the hexagon
benchmark uses a defconfig. The reason is simply that the
allyesconfig did not work on the first try in my environment (even
before applying this series), and I did not spend effort on
troubleshooting.

- Add Geert's review tag to patch 2/5. Despite also receiving the tag
for patch 1/5, I did not apply it because of new changes in that patch.

- Do not split the lines containing tags.

Link: https://lore.kernel.org/all/20231217071250.892867-1-mailhol.vincent@xxxxxxxxxx/

v2 -> v3:

- Add patches 1/5 and 2/5 to optimize m68k architecture bitops.
Thanks to the kernel test robot for reporting!

- Add patches 3/5 and 4/5 to optimize hexagon architecture bitops.
Thanks to the kernel test robot for reporting!

- Patch 5/5: mark test_bitops_const_eval() as __always_inline and,
with that done, pass n (the test number) as a parameter. Previously,
only BITS(10) was tested. Add tests for BITS(0) and BITS(31).

Link: https://lore.kernel.org/all/20231130102717.1297492-1-mailhol.vincent@xxxxxxxxxx/

v1 -> v2:

- Drop the RFC patch. v1 was not ready to be applied on x86 because
of pending changes in arch/x86/include/asm/bitops.h. This was
finally fixed by Nick in commit 3dae5c43badf ("x86/asm/bitops: Use
__builtin_clz{l|ll} to evaluate constant expressions").
Thanks Nick!

- Update the commit description.

- Introduce the test_const_eval() macro to factor out common code.

- No functional change.

Link: https://lore.kernel.org/all/20221111081316.30373-1-mailhol.vincent@xxxxxxxxxx/

Vincent Mailhol (5):
  m68k/bitops: force inlining of all bit-find functions
  m68k/bitops: use __builtin_{clz,ctzl,ffs} to evaluate constant
    expressions
  hexagon/bitops: force inlining of all bit-find functions
  hexagon/bitops: use __builtin_{clz,ctzl,ffs} to evaluate constant
    expressions
  lib: test_bitops: add compile-time optimization/evaluations assertions

arch/hexagon/include/asm/bitops.h | 25 +++++++++++++++++++-----
arch/m68k/include/asm/bitops.h | 26 ++++++++++++++++++-------
lib/Kconfig.debug | 4 ++++
lib/test_bitops.c | 32 +++++++++++++++++++++++++++++++
4 files changed, 75 insertions(+), 12 deletions(-)

--
2.43.0