Re: [GIT PULL] bitmap patches for v6.8

From: Yury Norov
Date: Sun Jan 21 2024 - 18:35:30 EST


On Sun, Jan 21, 2024 at 01:47:21PM -0800, Linus Torvalds wrote:
> So I've left this to be my last pull request, because I hate how our
> header files are growing, and this part:
>
> include/linux/find.h | 301 ++++++++++++++++++++++++++++++-
> 1 file changed, 297 insertions(+), 4 deletions(-)
>
> in particular.
>
> Nobody includes <linux/find.h> directly, but indirectly pretty much
> *every* single kernel C file includes it.
>
> Looking at some basic stats of my dependency files in my tree, 4426 of
> 4526 object files (~98%) depend on find.h because they get it through
> *some* path that ends up with bitmap.h -> find.h.
>
> And honestly, the number of files that actually want the new functions
> is basically just a tiny handful. It's also not obvious how useful
> those optimizations are, considering that a lot of the loops are
> *tiny*. I looked at a few cases, and the size of the bitmap it was
> iterating over was often in the 2-4 range, sometimes (like
> RTW89_TXCH_NUM) 13, etc.
>
> In radio-shark, you replaced a loop like this
>
> for (i = 0; i < 2; i++) {
>
> with that for_each_test_and_clear_bit(), and it *really* isn't clear
> that it was worth it. It sure wasn't performance-critical to begin
> with.
>
> In general, if an "optimization" doesn't have any performance numbers
> attached to it, is it an optimization at all?

No, this is not a performance optimization, and I don't claim that.
Jan Kara reported some performance improvement, but performance is not
the main goal of the series, and I didn't run performance tests for
this on myself.

The original motivation came from the fact that using non-volatile
find_bit() together with volatile test_and_set_bit() may trigger
KCSAN warning on concurrent memory access.

People wanted to make the whole find API volatile, which is a bad idea
for sure. So I had to give them a reasonable alternative. After some
thinking I decided that we just need a separate set of volatile find API.
Check this for initial discussion:

https://lore.kernel.org/lkml/634f5fdf-e236-42cf-be8d-48a581c21660@xxxxxxxxxxxx/T/#m3e7341eb3571753f3acf8fe166f3fb5b2c12e615

It also makes the code cleaner, at least to my taste, and some
reviewers' too. And to some degree less bug-prone. For example,
ILSEL_LEVELS is just 15, but traversing code in ilsel_enable()
is buggy, as Geert spotted. And switching to atomic find() fixes
it automatically:

https://lore.kernel.org/lkml/CAMuHMdWHxesM-EOOMtrrw3Caz+5Wux35QiKOjvwA=vwQpRe26Q@xxxxxxxxxxxxxx/T/#me53217b32fd5623af6c7eafa5c4ce6d0465f6c58

(His review came just a couple days ago, after I submitted the pull
request, so the tag is not there.)

> So I finally ended up pulling this, but after looking at the patch I
> went "this is adding more lines than it removes, has no performance
> numbers, and grows a core header file that is included by absolutely
> everything by a third".
>
> .. and then I decided to just unpull it again.

Yes, it's true. Bitmap.h historically includes everything related to
bitmaps, and it became too big. I started moving some chunks out of
it already. For example, aae06fc1b5a2e splits out string-related bitmap
code to a separate bitmap-str.h. That patch was more about human
readability, and it keeps the order, so that bitmap.h includes
bitmap-str.h, but we can easily turn that around.

I wanted to do it some day, but actually nobody complained before now,
and I wanted to collect more material to make it a series.

I still think that kernel needs atomic find_bit() API. If you change
your mind and pull the series, I can make patches that split-out
atomic find() declarations to a separate header, not included by
bitmap.h, and submit it for -rc2, together with bitmap-str.h rework.

Thanks,
Yury