Re: [PATCH tip/core/rcu] Do not keep timekeeping CPU tick running for non-nohz_full= CPUs

From: Paul E. McKenney
Date: Mon Jul 21 2014 - 13:33:18 EST


On Mon, Jul 21, 2014 at 07:04:59PM +0200, Peter Zijlstra wrote:
> On Mon, Jul 21, 2014 at 08:57:41AM -0700, Paul E. McKenney wrote:
> > On Sun, Jul 20, 2014 at 10:34:17PM +0200, Peter Zijlstra wrote:
> > > On Sun, Jul 20, 2014 at 04:47:59AM -0700, Paul E. McKenney wrote:
> > > > So we really have to have -all- the CPUs be idle to turn off the timekeeper.
> > >
> > > That seems to be pretty unavoidable any which way around.
> >
> > Hmmm... The exception would be the likely common case where none of
> > the CPUs are flagged as nohz_full= CPUs. If we handled that case as
> > if CONFIG_NO_HZ_FULL=n, we would have handled almost all of
> > the problem.
>
> You mean that is not currently the case? Yes that seems like a fairly
> sane thing to do.

Hard to say -- need to see where Frederic is putting the call to
rcu_sys_is_idle(). On the RCU side, I could potentially lower overhead
by checking tick_nohz_full_enabled() in a few functions.
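
As a rough illustration of the special case discussed above, here is a
userspace sketch (not kernel code) of a timekeeping CPU deciding whether it
may stop its tick. Everything named model_* is hypothetical and exists only
for this example: model_nohz_full_enabled() stands in for
tick_nohz_full_enabled(), and model_sys_is_idle() stands in for
rcu_sys_is_idle(). The real functions have different signatures and far more
involved state tracking. The sketch assumes that when no CPU is listed in
nohz_full=, the system behaves as if CONFIG_NO_HZ_FULL=n.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical model state; these are NOT the kernel's data structures. */
struct model_system {
	bool nohz_full[MODEL_NR_CPUS];	/* CPUs named in nohz_full= */
	bool idle[MODEL_NR_CPUS];	/* CPUs currently idle */
};

/* Stand-in for tick_nohz_full_enabled(): is any CPU a nohz_full= CPU? */
static bool model_nohz_full_enabled(const struct model_system *s)
{
	for (int i = 0; i < MODEL_NR_CPUS; i++)
		if (s->nohz_full[i])
			return true;
	return false;
}

/* Stand-in for rcu_sys_is_idle(): in this model, all CPUs must be idle. */
static bool model_sys_is_idle(const struct model_system *s)
{
	for (int i = 0; i < MODEL_NR_CPUS; i++)
		if (!s->idle[i])
			return false;
	return true;
}

/* May the timekeeping CPU stop its scheduling-clock tick? */
static bool model_timekeeper_can_stop_tick(const struct model_system *s,
					   int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);

	if (!s->idle[cpu])
		return false;	/* A busy CPU keeps its tick. */

	/* No nohz_full= CPUs: behave as if CONFIG_NO_HZ_FULL=n. */
	if (!model_nohz_full_enabled(s))
		return true;

	/* Otherwise the tick may stop only when the whole system is idle. */
	return model_sys_is_idle(s);
}
```

In this model, an idle timekeeping CPU on a system booted without nohz_full=
stops its tick immediately, which is the case the battery-powered users
care about. Once any CPU is flagged nohz_full=, the tick stays on until every
CPU is idle.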

> > > > This won't make the battery-powered embedded guys happy...
> > > >
> > > > Other thoughts on this? We really should not be setting
> > > > CONFIG_NO_HZ_FULL_SYSIDLE by default until this is solved.
> > >
> > > What are those same guys doing with nohz_full to begin with?
> >
> > If CONFIG_NO_HZ_FULL_SYSIDLE=y is the default, my main concern is for
> > people who didn't really want it, and who thus did not set the nohz_full=
> > boot parameter. Hence my suggestion above that we treat that case as
> > if CONFIG_NO_HZ_FULL=n (and thus also as if CONFIG_NO_HZ_FULL_SYSIDLE=n).
>
> ack
>
> > There have been some people saying that they want only a subset of
> > their CPUs in nohz_full= state, and these guys seem to want to run a
> > mixed workload. For example, they have HPC (or RT) workloads on the
> > nohz_full= CPUs, and also want normal high-throughput processing on the
> > remaining CPUs. If software were trivial (and making other unlikely
> > assumptions about the perfection of the world and the invalidity of
> > Murphy's law), we would want the timekeeping CPU to be able to move
> > among the non-nohz_full= CPUs.
>
> Yeah, I don't see a problem with that, but then I'm not entirely sure
> why we use RCU to track system idle state.

Because RCU needs to do very similar tracking to deal with dyntick-idle
CPUs and the various types of RCU grace periods.

> > However, this should be a small fraction of the users, and many of
> > these guys would probably be open to making a few changes. Thus, a
> > less-proactive approach should allow us to solve their actual problems, as
> > opposed to the problems that we speculate that they might encounter. ;-)
>
> But you still haven't talked about the battery people... I don't think
> nohz_full is something they should care about / use.

For all I know, they might care, but it is all speculative at this point.
One possible use case would be if they needed some HPC-style
computations for some misbegotten mobile implementation of some
misbegotten game.

So as far as I know at this point, the common case for the battery-powered
guys is that they don't want unconditional scheduling-clock interrupts
on CPU 0 when CPU 0 is idle, and that case is covered by our discussion
above.

Thanx, Paul
