Re: [PATCH] workqueue: Fix memory ordering race in queue_work*()

From: Tejun Heo
Date: Tue Aug 16 2022 - 13:01:29 EST


Hello, Will.

On Tue, Aug 16, 2022 at 02:41:57PM +0100, Will Deacon wrote:
> /**
> * test_and_set_bit - Set a bit and return its old value
> * @nr: Bit to set
> * @addr: Address to count from
> *
> * This operation is atomic and cannot be reordered.
> * It may be reordered on other architectures than x86.
> * It also implies a memory barrier.
> */
>
> so while Peter and I were trying to improve the documentation for
> atomics and memory barriers we clearly ended up making the wrong call
> trying to treat this like e.g. a cmpxchg() (which has the
> unordered-on-failure semantics).

I think the doc can be improved here. atomic_t.txt says under ORDERING:

- RMW operations that have a return value are fully ordered;

- RMW operations that are conditional are unordered on FAILURE,
otherwise the above rules apply.

But nothing spells out what's conditional. Maybe it's okay to expect people
to read this doc and extrapolate how it applies, but I think it'd be better
if we spell out clearly per operaiton so that readers can search for a
speicific operation and then follow what the rules are from there.

It bothers me that there doesn't seem to be a comprehensive
operation-indexed doc on the subject. memory-barrier.txt doesn't cover which
operations do what barriers. atomic_t.txt and atomic_bitops.txt cover the
atomic_t operations but not in a comprehensive or searchable manner (e.g.
does test_and_set_bit() return 0 or 1 on success?) and it's not clear where
to look for non-atomic_t atomic operations. I guess it's implied that they
follow the same rules as atomic_t counterparts but I can't seem to find that
spelled out anywhere. The source code docs are layered and dispersed for
generic and arch implemetnations making them difficult to follow.

It'd be awesome if the documentation situation can be improved.

Thanks.

--
tejun