Re: perf events ring buffer memory barrier on powerpc

From: Mikulas Patocka
Date: Fri May 09 2014 - 08:20:56 EST

On Fri, 9 May 2014, Victor Kaplansky wrote:

> Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote on 05/08/2014 11:46:53 PM:
>
> > > > BTW, it is why you also don't need ACCESS_ONCE() around @tail,
> > > > but only around @head read.
> > >
> > > Agreed, the ACCESS_ONCE() around tail is superfluous since we're the one
> > > updating tail, so there's no problem with the value changing
> > > unexpectedly.
> >
> > You need ACCESS_ONCE even if you are the only process writing the value.
> > Because without ACCESS_ONCE, the compiler may perform store tearing and
> > split the store into several smaller stores. Search the file
> > "Documentation/memory-barriers.txt" for the term "store tearing", it shows
> > an example where one instruction storing 32-bit value may be split to two
> > instructions, each storing 16-bit value.
> >
> > Mikulas
>
> AFAIR, I was talking about redundant ACCESS_ONCE() around @tail *read* in
> consumer code. As for ACCESS_ONCE() around @tail write in consumer code,
> I see your point, but I don't think that volatile imposed by ACCESS_ONCE()
> is appropriate, since:
>
>     - compiler can generate several stores despite volatile if @tail
>     is bigger in size than native machine data size, e.g. 64-bit on
>     a 32-bit CPU.

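That's true, so you should define data_head and data_tail as "unsigned
long", not "__u64". An unsigned long is the native word size on the
architectures Linux runs on, so the compiler can emit a volatile store to
it as a single instruction.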
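
To make the size argument concrete, here is a small userspace sketch. The
struct and the FITS_IN_WORD macro are hypothetical names made up for this
illustration, and it assumes that unsigned long is the machine word, as
the Linux kernel does.

```c
#include <stdint.h>

/* Hypothetical control structure, only to illustrate the field types. */
struct ring_ctl {
	unsigned long data_head;	/* 4 bytes on 32-bit, 8 bytes on 64-bit */
	unsigned long data_tail;
};

/*
 * Nonzero if the type is no wider than a machine word.
 * Assumption: unsigned long is the machine word (true for Linux targets).
 */
#define FITS_IN_WORD(type) (sizeof(type) <= sizeof(unsigned long))
```

FITS_IN_WORD(unsigned long) always holds, while FITS_IN_WORD(uint64_t)
holds only where unsigned long is at least 64 bits wide. On a 32-bit CPU
it is false, which is the case where the store may become two
instructions.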

>     - volatile imposed by ACCESS_ONCE() does nothing to prevent CPU from
>     reordering, splitting or merging accesses. It can only mediate
>     communication problems between processes running on same CPU.

That's why you need an SMP barrier in addition to ACCESS_ONCE. You need
both: the SMP barrier (to prevent the CPU from reordering) and ACCESS_ONCE
(to prevent the compiler from splitting the write into smaller memory
accesses).
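
A minimal userspace sketch of that combination follows. It is not the
perf code: the ring layout, RING_SIZE, ring_put and ring_get are
hypothetical, ACCESS_ONCE and smp_mb are stand-ins built on GCC/Clang
extensions rather than the kernel's definitions, and it uses full
barriers everywhere for simplicity.

```c
/* Userspace stand-ins; the kernel's own definitions differ in detail. */
#define ACCESS_ONCE(x)	(*(volatile __typeof__(x) *)&(x))
#define smp_mb()	__sync_synchronize()	/* full memory barrier */

#define RING_SIZE 8UL	/* hypothetical, must be a power of two */

struct ring {
	unsigned long head;	/* written only by the producer */
	unsigned long tail;	/* written only by the consumer */
	int data[RING_SIZE];
};

/* Producer: publish one value. Returns 0 if the ring is full. */
static int ring_put(struct ring *r, int v)
{
	unsigned long head = r->head;	/* plain read: only we write head */
	unsigned long tail = ACCESS_ONCE(r->tail);

	if (head - tail == RING_SIZE)
		return 0;
	smp_mb();	/* read tail before overwriting the slot */
	r->data[head & (RING_SIZE - 1)] = v;
	smp_mb();	/* data must be visible before the new head */
	ACCESS_ONCE(r->head) = head + 1;	/* one store, not torn */
	return 1;
}

/* Consumer: fetch one value. Returns 0 if the ring is empty. */
static int ring_get(struct ring *r, int *v)
{
	unsigned long tail = r->tail;	/* plain read: only we write tail */
	unsigned long head = ACCESS_ONCE(r->head);

	if (head == tail)
		return 0;
	smp_mb();	/* read head before reading the data */
	*v = r->data[tail & (RING_SIZE - 1)];
	smp_mb();	/* finish reading before releasing the slot */
	ACCESS_ONCE(r->tail) = tail + 1;	/* one store, not torn */
	return 1;
}
```

The consumer reads tail without ACCESS_ONCE because nobody else changes
it, but writes it with ACCESS_ONCE so that the producer never observes a
half-written value.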


Since Linux 3.14, there are new macros, smp_store_release and
smp_load_acquire, that combine ACCESS_ONCE and a memory barrier, so you can
use them. (They call compiletime_assert_atomic_type to make sure that you
don't use them on types that are not atomic, such as long long on 32-bit
architectures.)
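
As a rough userspace analogue, the sketch below builds similar helpers on
the GCC/Clang __atomic builtins. The names user_store_release,
user_load_acquire and assert_word_sized, and the rb ring itself, are
hypothetical and are not the kernel macros; the size check only imitates
the intent of compiletime_assert_atomic_type.

```c
/* Reject, at compile time, types wider than a machine word. */
#define assert_word_sized(x) \
	_Static_assert(sizeof(x) <= sizeof(long), "type wider than a word")

/* Store that no earlier load or store can be reordered after. */
#define user_store_release(p, v)				\
	do {							\
		assert_word_sized(*(p));			\
		__atomic_store_n((p), (v), __ATOMIC_RELEASE);	\
	} while (0)

/* Load that no later load or store can be reordered before. */
#define user_load_acquire(p)					\
	({							\
		assert_word_sized(*(p));			\
		__atomic_load_n((p), __ATOMIC_ACQUIRE);		\
	})

#define RB_SIZE 8UL	/* hypothetical, must be a power of two */

struct rb {
	unsigned long head;	/* written only by the producer */
	unsigned long tail;	/* written only by the consumer */
	int data[RB_SIZE];
};

/* Producer: publish one value. Returns 0 if the ring is full. */
static int rb_put(struct rb *r, int v)
{
	unsigned long head = r->head;
	unsigned long tail = user_load_acquire(&r->tail);

	if (head - tail == RB_SIZE)
		return 0;
	r->data[head & (RB_SIZE - 1)] = v;
	user_store_release(&r->head, head + 1);	/* data before head */
	return 1;
}

/* Consumer: fetch one value. Returns 0 if the ring is empty. */
static int rb_get(struct rb *r, int *v)
{
	unsigned long tail = r->tail;
	unsigned long head = user_load_acquire(&r->head);

	if (head == tail)
		return 0;
	*v = r->data[tail & (RB_SIZE - 1)];
	user_store_release(&r->tail, tail + 1);	/* reads before tail */
	return 1;
}
```

With these helpers, declaring head and tail as long long on a 32-bit
target would fail to compile instead of silently producing a store that
can be split.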

> What you really want is to guarantee *atomicity* of @tail write on consumer
> side.
>
> -- Victor

Mikulas