Re: [RFC PATCH] introduce sys_membarrier(): process-wide memory barrier (v3b)

From: Paul E. McKenney
Date: Tue Jan 12 2010 - 19:23:52 EST


On Tue, Jan 12, 2010 at 01:56:41PM -0500, Mathieu Desnoyers wrote:
> * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> > On Tue, Jan 12, 2010 at 10:38:54AM -0500, Mathieu Desnoyers wrote:
> > > * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> > > > On Sun, Jan 10, 2010 at 11:30:16PM -0500, Mathieu Desnoyers wrote:
> > > > > Here is an implementation of a new system call, sys_membarrier(), which
> > > > > executes a memory barrier on all threads of the current process.
> > > > >
> > > > > It aims at greatly simplifying and enhancing the current signal-based
> > > > > liburcu userspace RCU synchronize_rcu() implementation.
> > > > > (found at http://lttng.org/urcu)
> > > >
> > > > I didn't expect quite this comprehensive of an implementation from the
> > > > outset, but I guess I cannot complain. ;-)
> > > >
> > > > Overall, good stuff.
> > > >
> > > > Interestingly enough, what you have implemented is analogous to
> > > > synchronize_rcu_expedited() and friends that have recently been added
> > > > to the in-kernel RCU API. By this analogy, my earlier semi-suggestion
> > > > of synchronize_rcu() would be a candidate non-expedited implementation.
> > > > Long latency, but extremely low CPU consumption, full batching of
> > > > concurrent requests (even unrelated ones), and so on.
> > >
> > > Yes, the main difference, I think, is that the sys_membarrier
> > > infrastructure focuses on IPI-ing only the current process's running
> > > threads.
> >
> > Which does indeed make sense for the expedited interface. On the other
> > hand, if you have a bunch of concurrent non-expedited requests from
> > different processes, covering all CPUs efficiently satisfies all of
> > the requests in one go. And, if you use synchronize_sched() for the
> > non-expedited case, there will be no IPIs in the common case.
>
> So are you proposing we add an "int expedited" parameter to the
> system call, and let the caller choose between the IPI and
> synchronize_sched() schemes?

Sure, why not?
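
Something along these lines, perhaps. This is only a userspace sketch of
the dispatch, not kernel code: membarrier_sketch() and both *_stub()
helpers are hypothetical names, and the counters merely stand in for the
real IPI and synchronize_sched() paths so the control flow can be seen.

```c
/* Hypothetical counters standing in for the two kernel-side mechanisms. */
static int ipi_rounds;   /* expedited path: IPIs to CPUs running our threads */
static int sched_waits;  /* non-expedited path: waits for a sched grace period */

/* Stand-in for IPI-ing each CPU currently running a thread of the caller. */
static void ipi_running_threads_stub(void)
{
	ipi_rounds++;
}

/* Stand-in for synchronize_sched(): long latency, no IPIs in the common case. */
static void synchronize_sched_stub(void)
{
	sched_waits++;
}

/*
 * Sketch of an entry point taking an "int expedited" parameter.
 * Assumption: any nonzero value selects the expedited (IPI) scheme.
 */
static int membarrier_sketch(int expedited)
{
	if (expedited)
		ipi_running_threads_stub();
	else
		synchronize_sched_stub();
	return 0;
}
```

A caller that cares about latency would pass 1 and pay for the IPIs; one
that can tolerate a long wait would pass 0 and share the grace period
with any other concurrent requests.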

> [...]
> > > > Also, is "top"
> > > > accurate given that the IPI handler will have interrupts disabled?
> > >
> > > Probably not. AFAIK, "top" does not really factor interrupts into its
> > > accounting. So, better take this top output with a grain of salt or two.
> >
> > Might need something like oprofile to get good info?
>
> Could be. Although I just wanted to point out the kind of pattern we
> should expect. I'm not convinced it's so useful to give the detailed
> oprofile info. I'm rephrasing the above paragraph to state that top is
> not super-accurate here.

K.

Thanx, Paul

> [...]
>
> Thanks,
>
> Mathieu
>
> --
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68