Re: [GIT PULL] bitmap changes for v6.2-rc1

From: Linus Torvalds
Date: Fri Dec 23 2022 - 13:44:32 EST


On Thu, Dec 15, 2022 at 3:11 PM Yury Norov <yury.norov@xxxxxxxxx> wrote:
>
> Please pull bitmap patches for v6.2. They spent in -next for more than
> a week without any issues. The branch consists of:

So I've been holding off on this because these bitmap pulls have
always scared me, and I wanted to have the time to actually look
through them in detail before pulling.

I'm back home, over the travel chaos, and while I have other pulls
pending, they seem to be benign fixes so I started looking at this.

And when looking at it, I did indeed finx what I think is a
fundamental arithmetic bug.

That small_const_nbits_off() is simply buggy.

Try this:

small_const_nbits_off(64,-1);

and see it return true.

The thing is, doing divides in C is something you need to be very
careful about, because they do *not* behave nicely with signed values.
They do the "mathematically intuitive" thing of truncating towards
zero, but when you are talking about bit ranges like this, that is
*NOT* what you want.

The comment is fairly good, but it just doesn't match the code.

If you want to check that 'nbits-1' and 'off' are in the same word,
you simply should not use division at all. Sure, it would work, if you
verify that they are both non-negative, but the fact is, even then
it's just wrong, wrong, wrong. You want a shift operation or a masking
operation.

Honestly, in this case, I think the logical thing to do is "check that
the upper bits are the same". The way you do that is probably
something like

!((off) ^ ((nbits)-1) & ~(BITS_PER_LONG-1))

which doesn't have the fundamental bug that the divide version has.

So anyway, I won't be pulling this. I appreciate that you are trying
to micro-optimize the bitop functions, but this is not the first time
your pull requests have made me worried and caused me to not pull.
This is such basic functionality that I really need these pull
requests to be less worrisome.

In other words, not only don't I want to see these kinds of bugs -
please make sure that there isn't anythign that looks even *remotely*
scary. The micro-optimizations had better be *so* obvious and so well
thought through that I not only don't find bugs in them, I don't even
get nervous about finding bugs.

Side note: that whole small_const_nbits_off() thing could be possibly
improved in other ways. It's actually unnecessarily strict (if it was
correct, that is): we don't really require 'off' to be constant, what
we *do* want is

- "nbits > off" needs to be a compile-time true constant

- that "off is in the same word as nbits-1" also needs to be a
compile-time true constant.

and the compiler could determine both of those to be the case even if
'off' itself isn't a constant, if there is code around it that
verifies it.

For example, if you have code like this:

off = a & 7;
n = find_next_bit(bitmap, 64, off);

then the compiler could still see that it's that "last word only"
case, even though 'off' isn't actually a constant value.

So you could try using a helper macro like

#define COMPILE_TIME_TRUE(x) (__builtin_constant_p(x) && (x))

to handle those kinds of things. But again - the code needs to be
OBVIOUSLY CORRECT. Make me feel those warm and fuzzies about it
because it's so obviously safe and correct.

That said, I see your test-cases for your micro-optimizations, but I'd
also like to see references to where it actually improves real code.

I know for a fact that 'small_const_nbits()' actually triggers in real
life. I don't see that your 'small_const_nbits_off()' would trigger in
real life in any situation that isn't already covered by
'small_const_nbits()'.

So convince me not only that the optimizations are obviously correct,
but also that they actually matter.

Linus