Re: [PATCH] rcu: Only pin GP kthread when full dynticks is actually used

From: Frederic Weisbecker
Date: Fri Jun 13 2014 - 12:00:16 EST


On Fri, Jun 13, 2014 at 08:52:33AM -0700, Paul E. McKenney wrote:
> On Fri, Jun 13, 2014 at 02:47:16PM +0200, Frederic Weisbecker wrote:
> > On Thu, Jun 12, 2014 at 06:35:15PM -0700, Paul E. McKenney wrote:
> > > On Thu, Jun 12, 2014 at 06:24:32PM -0700, Paul E. McKenney wrote:
> > > > On Fri, Jun 13, 2014 at 02:16:59AM +0200, Frederic Weisbecker wrote:
> > > > > CONFIG_NO_HZ_FULL may be enabled widely on distros nowadays but actual
> > > > > users should be a tiny minority, if actually any.
> > > > >
> > > > > Also there is a risk that affining the GP kthread to a single CPU could
> > > > > end up noticeably reducing RCU performances and increasing energy
> > > > > consumption.
> > > > >
> > > > > So lets affine the GP kthread only when nohz full is actually used
> > > > > (ie: when the nohz_full= parameter is filled or CONFIG_NO_HZ_FULL_ALL=y)
> > >
> > > Which reminds me... Kernel-heavy workloads running NO_HZ_FULL_ALL=y
> > > can see long RCU grace periods, as in about two seconds each. It is
> > > not hard for me to detect this situation.
> >
> > Ah yeah sounds quite long.
> >
> > > Is there some way I can
> > > call for a given CPU's scheduling-clock interrupt to be turned on?
> >
> > Yeah, once the nohz kick patchset (https://lwn.net/Articles/601214/) is merged,
> > a simple call to tick_nohz_full_kick_cpu() should do the trick. Although the
> > right condition must be there on the IPI side. Maybe with rcu_needs_cpu() or such.
>
> I could record the offending GP, and make rcu_needs_cpu() return true
> if the current GP matches the offending one.
>
> > But it would be interesting to identify the sources of these extended grace periods.
> > If we only restart the tick, we may ignore some deeper oustanding issue.
>
> Some of them have been fixable by other means, but they will probably
> come back as system sizes grow. And I really have put preemption points
> into kernel code in response to RCU CPU stall warnings, and the current
> state of NO_HZ_FULL effectively ignores these preemption points. :-/

I'm not sure I really understand the issue though. So you have RCU CPU stalls due
to very extended grace periods, right?

I'm not sure how preemption points would solve that. Or maybe you're
trying to trigger quiescent states reports through these preemption points?

Is it because we have dynticks CPUs staying too long in the kernel without
taking any quiescent states? Are we perhaps missing some rcu_user_enter() or
things?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/