Re: perf events ring buffer memory barrier on powerpc

From: Paul E. McKenney
Date: Mon Nov 04 2013 - 05:08:02 EST


On Mon, Nov 04, 2013 at 10:07:44AM +0100, Peter Zijlstra wrote:
> On Sat, Nov 02, 2013 at 08:20:48AM -0700, Paul E. McKenney wrote:
> > On Fri, Nov 01, 2013 at 11:30:17AM +0100, Peter Zijlstra wrote:
> > > Furthermore there's a gazillion parallel userspace programs.
> >
> > Most of which have very unaggressive concurrency designs.
>
> pthread_mutex_t A, B;
>
> char data_A[x];
> int counter_B = 1;
>
> void funA(void)
> {
> 	pthread_mutex_lock(&A);
> 	memset(data_A, 0, sizeof(data_A));
> 	pthread_mutex_unlock(&A);
> }
>
> void funB(void)
> {
> 	pthread_mutex_lock(&B);
> 	counter_B++;
> 	pthread_mutex_unlock(&B);
> }
>
> void funC(void)
> {
> 	pthread_mutex_lock(&B);
> 	printf("%d\n", counter_B);
> 	pthread_mutex_unlock(&B);
> }
>
> Then run: funA, funB, funC concurrently, and end with a funC.
>
> Then explain to userman that his unaggressive program can return:
> 0
> 1
>
> Because the memset() thought it might be a cute idea to overwrite
> counter_B and fix it up 'later'. Which if I understood you right is
> valid in C/C++ :-(
>
> Not that any actual memset implementation exhibiting this trait would
> escape being shot on the spot.

Even without such a malicious memset() implementation, I must still
explain false sharing when the developer notices that the unaggressive
program isn't running as fast as expected.
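
To make the false-sharing point concrete, here is a minimal sketch, not
anything from a real program. It assumes a 64-byte cache line (this is
hardware-dependent), picks a hypothetical size of 100 for data_A, and
uses made-up names (struct shared, fields_share_line). Aligning each
field to a line boundary keeps the data guarded by A and the counter
guarded by B from bouncing the same line between CPUs:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64	/* assumed line size; check your hardware */

/* Hypothetical layout: each lock's data starts on its own cache line. */
struct shared {
	alignas(CACHE_LINE) char data_A[100];	/* protected by mutex A */
	alignas(CACHE_LINE) int counter_B;	/* protected by mutex B */
};

static struct shared s = { .counter_B = 1 };

/* The counter must begin on a line boundary, past the end of data_A. */
static_assert(offsetof(struct shared, counter_B) % CACHE_LINE == 0,
	      "counter_B is not cache-line aligned");
static_assert(offsetof(struct shared, counter_B) >= sizeof(s.data_A),
	      "counter_B overlaps data_A");

/* Returns nonzero if the last byte of data_A shares a line with counter_B. */
static int fields_share_line(void)
{
	uintptr_t last_a = (uintptr_t)&s.data_A[sizeof(s.data_A) - 1];
	uintptr_t b = (uintptr_t)&s.counter_B;

	return last_a / CACHE_LINE == b / CACHE_LINE;
}
```

The cost is padding: the structure grows to a multiple of the line size,
which is usually an acceptable trade for data that different threads
update under different locks.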

> > > > By marking "ptr" as atomic, thus telling the compiler not to mess with it.
> > > > And thus requiring that all accesses to it be decorated, which in the
> > > > case of RCU could be buried in the RCU accessors.
> > >
> > > This seems contradictory; marking it atomic would look like:
> > >
> > > struct foo {
> > > unsigned long value;
> > > __atomic void *ptr;
> > > unsigned long value1;
> > > };
> > >
> > > Clearly we cannot hide this definition in accessors, because then
> > > accesses to value* won't see the annotation.
> >
> > #define __rcu __atomic
>
> Yeah, except we don't use __rcu all that consistently; in fact I don't
> know if I ever added it.

There are more than 300 of them in the kernel. Plus sparse can be
convinced to yell at you if you don't use them. So lack of __rcu could
be fixed without too much trouble.

The C/C++11 requirement to annotate functions that take arguments or
return values derived from rcu_dereference() is another story. But for
missing annotations to cause trouble, either compilers have to get
significantly more aggressive, or developers have to be doing unusual
things that result in rcu_dereference() returning something whose value
the compiler can predict exactly.
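
For concreteness, here is a rough sketch of what marking "ptr" atomic
might look like in plain C11, using _Atomic in place of the "__atomic"
spelling quoted above. The accessors publish_ptr() and read_ptr() are
hypothetical stand-ins for rcu_assign_pointer() and rcu_dereference(),
not the kernel's implementations. The marking stays in the structure
definition, where accesses to the neighboring fields can see it, while
the decorated accesses are buried in the accessors:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct foo {
	unsigned long value;
	void *_Atomic ptr;	/* atomic pointer; compiler must not tear it */
	unsigned long value1;
};

/* Hypothetical updater-side accessor: publish a new pointer. */
static inline void publish_ptr(struct foo *f, void *p)
{
	atomic_store_explicit(&f->ptr, p, memory_order_release);
}

/* Hypothetical reader-side accessor: fetch the pointer for dereferencing. */
static inline void *read_ptr(struct foo *f)
{
	/* Current compilers typically promote consume to acquire. */
	return atomic_load_explicit(&f->ptr, memory_order_consume);
}
```

Plain loads and stores to value and value1 need no decoration here; only
accesses to ptr go through the accessors.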

Thanx, Paul
