Re: [PATCH v4] lib/dlock-list: Scale dlock_lists_empty()

From: Waiman Long
Date: Tue Nov 07 2017 - 13:57:17 EST


On 11/07/2017 12:59 PM, Andreas Dilger wrote:
> On Nov 7, 2017, at 4:59 AM, Jan Kara <jack@xxxxxxx> wrote:
>> On Mon 06-11-17 10:47:08, Davidlohr Bueso wrote:
>>> + /*
>>> + * Serialize dlist->used_lists such that a 0->1 transition is not
>>> + * missed by another thread checking if any of the dlock lists are
>>> + * used.
>>> + *
>>> + * CPU0                                CPU1
>>> + * dlock_list_add()                    dlock_lists_empty()
>>> + *   [S] atomic_inc(used_lists);
>>> + *       smp_mb__after_atomic();
>>> + *                                          smp_mb__before_atomic();
>>> + *                                      [L] atomic_read(used_lists)
>>> + *       list_add()
>>> + */
>>> + smp_mb__before_atomic();
>>> + return !atomic_read(&dlist->used_lists);
> Just a general kernel programming question here - I thought the whole point
> of atomics is that they are, well, atomic across all CPUs so there is no
> need for a memory barrier? If there is a need for a memory barrier for
> each atomic access (assuming it isn't accessed under another lock, which would
> make the use of atomic types pointless, IMHO) then I'd think there is a lot
> of code in the kernel that isn't doing this properly.
>
> What am I missing here?

Atomic updates and memory barriers are two different things. An atomic
update means other CPUs see either the value before or the value after
the update, never anything in between. For a counter, that means no
counts are lost. However, not all atomic operations give an ordering
guarantee. atomic_read() and atomic_inc() are examples that do not
provide any memory ordering guarantee. See
Documentation/memory-barriers.txt for more information.
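
To make the "no lost counts" point concrete, here is a minimal userspace
sketch. It is not kernel code: it assumes C11 <stdatomic.h> and pthreads,
with a relaxed atomic_fetch_add_explicit() standing in for atomic_inc().
The names worker(), run_counter_demo(), NTHREADS and NINCS are made up
for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NTHREADS 4
#define NINCS    100000

/* Shared counter, a stand-in for something like used_lists. */
static atomic_int counter;

/* Each thread increments the counter with no ordering guarantee at all. */
static void *worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < NINCS; i++)
		atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
	return NULL;
}

/* Runs NTHREADS workers to completion and returns the final count. */
int run_counter_demo(void)
{
	pthread_t tid[NTHREADS];

	atomic_store(&counter, 0);
	for (int i = 0; i < NTHREADS; i++) {
		int rc = pthread_create(&tid[i], NULL, worker, NULL);
		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < NTHREADS; i++)
		pthread_join(tid[i], NULL);

	/* pthread_join() orders this load after all the increments. */
	return atomic_load(&counter);
}
```

Even with relaxed ordering, run_counter_demo() should always return
NTHREADS * NINCS, because each increment is indivisible. What relaxed
ordering does not give is any promise about how these increments are
ordered against accesses to other memory locations.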

A CPU can perform atomic operations 1 & 2 in program order, but other
CPUs may see operation 2 before operation 1. A memory barrier can be
used here to guarantee that other CPUs see the memory updates in a
certain order.
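
A rough userspace illustration of that ordering problem, again assuming
C11 atomics and pthreads rather than the kernel API. Here
atomic_thread_fence(memory_order_seq_cst) is used as an approximate
analogue of a full barrier such as smp_mb(). The C11 memory model and
the kernel's are not identical, so treat this as a sketch. The names
payload, ready, producer(), consumer() and run_ordering_demo() are
hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int payload;	/* written by "operation 1" */
static atomic_int ready;	/* written by "operation 2" */

static void *producer(void *arg)
{
	(void)arg;
	/* Operation 1: relaxed store, no ordering on its own. */
	atomic_store_explicit(&payload, 42, memory_order_relaxed);
	/* Full fence: operation 1 is ordered before operation 2. */
	atomic_thread_fence(memory_order_seq_cst);
	/* Operation 2: relaxed store of the flag. */
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
	return NULL;
}

static void *consumer(void *arg)
{
	int *seen = arg;

	/* Spin until operation 2 becomes visible. */
	while (!atomic_load_explicit(&ready, memory_order_relaxed))
		;
	/* Matching fence on the observing side. */
	atomic_thread_fence(memory_order_seq_cst);
	/* With both fences in place this must observe operation 1. */
	*seen = atomic_load_explicit(&payload, memory_order_relaxed);
	return NULL;
}

/* Runs one producer/consumer pair; returns the payload the consumer saw. */
int run_ordering_demo(void)
{
	pthread_t p, c;
	int seen = -1;
	int rc;

	atomic_store(&payload, 0);
	atomic_store(&ready, 0);
	rc = pthread_create(&c, NULL, consumer, &seen);
	assert(rc == 0);
	rc = pthread_create(&p, NULL, producer, NULL);
	assert(rc == 0);
	(void)rc;
	pthread_join(p, NULL);
	pthread_join(c, NULL);
	return seen;
}
```

With both fences present the consumer is guaranteed to see 42. If
either fence is dropped, the C11 model allows the consumer to observe
ready == 1 and still read the old payload value, which is the "sees
operation 2 before operation 1" case. Note that the barrier on the
writing side only helps when the reading side has a matching one.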

Hope this helps.
Longman