Re: [PATCH 1/7] General notification queue with user mmap()'able ring buffer

From: Peter Zijlstra
Date: Fri May 31 2019 - 04:51:05 EST


On Wed, May 29, 2019 at 05:06:40PM +0100, David Howells wrote:

> Looking at the perf ring buffer, there appears to be a missing barrier in
> perf_aux_output_end():
>
> rb->user_page->aux_head = rb->aux_head;
>
> should be:
>
> smp_store_release(&rb->user_page->aux_head, rb->aux_head);

I've answered that in another email; the aux bit is 'magic'.

> It should also be using smp_load_acquire(). See
> Documentation/core-api/circular-buffers.rst

We use the control dependency instead, as described in the comment of
perf_output_put_handle():

* kernel user
*
* if (LOAD ->data_tail) { LOAD ->data_head
* (A) smp_rmb() (C)
* STORE $data LOAD $data
* smp_wmb() (B) smp_mb() (D)
* STORE ->data_head STORE ->data_tail
* }
*
* Where A pairs with D, and B pairs with C.
*
* In our case (A) is a control dependency that separates the load of
* the ->data_tail and the stores of $data. In case ->data_tail
* indicates there is no room in the buffer to store $data we do not.
*
* D needs to be a full barrier since it separates the data READ
* from the tail WRITE.
*
* For B a WMB is sufficient since it separates two WRITEs, and for C
* an RMB is sufficient since it separates two READs.

Userspace can choose to use smp_load_acquire() over the first smp_rmb()
if that is efficient for the architecture (for w ahole bunch of archs
load-acquire would end up using mb() while rmb() is adequate and
cheaper).