Re: [GIT PULL] pipe: nonblocking rw for io_uring

From: Linus Torvalds
Date: Tue Apr 25 2023 - 13:21:02 EST


On Mon, Apr 24, 2023 at 8:16 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> That said, you should move the
>
> old_fmode = READ_ONCE(file->f_mode);
>
> to outside the loop, because try_cmpxchg() will then update
> 'old_fmode' to the proper value and it should *not* be done again.
>
> I also suspect that the READ_ONCE() itself is entirely superfluous,
> because that is very much a value that we should let the compiler
> optimize to *not* have to access more than it needs to.

I ended up looking around a bit, and the READ_ONCE() before the
"try_cmpxchg()" loop is definitely our common pattern.

However, I'm adding in the locking people, because I think that
pattern is wrong and historical.

I think it comes from the original cmpxchg loop model, where we did
the read inside the loop, and the READ_ONCE() was some kind of "let's
make sure it's updated every time" thing (which should be unnecessary
due to memory clobbers on the cmpxchg, but whatever...

Inside the loop, the READ_ONCE() also makes more sense in general, in
that there isn't any sane point in merging it with earlier values, so
there's no downside.

But the more I look at it, the more I'm convinced that our pattern of

old = READ_ONCE(rq->fence.error);
do {
if (fatal_error(old))
return false;
} while (!try_cmpxchg(&rq->fence.error, &old, error));

(to pick one random user) is simply horribly wrong.

If we have code where we had an earlier load of that same value (or an
earlier store, for that matter - anything where the compiler can
assume some starting value), then the READ_ONCE() only adds pointless
overhead.

And if we don't have it, then the READ_ONCE() doesn't add any value,
since it doesn't imply any ordering.

IOW, I think the READ_ONCE() pattern is either pointless or
detrimental. I see no upside.

So I think we should make the pattern be used with a try_cmpxchg()
loop to be either a proper *ordered* read (ie something like
"smp_load_acquire()" for reading the original value), or we should
just let the compiler read it any which way it wants. Not the
READ_ONCE() pattern we seem to have.

But I'm adding the locking maintainers to this email to make sure that
I'm not missing anything.

PeterZ in particular, who added that new (and hugely improved!)
try_cmpxchg() interface in the first place.

Linus