Re: [PATCH] a local-timer-free version of RCU

From: Paul E. McKenney
Date: Tue Nov 16 2010 - 10:51:16 EST


On Tue, Nov 16, 2010 at 02:52:34PM +0100, Frederic Weisbecker wrote:
> On Mon, Nov 15, 2010 at 05:28:46PM -0800, Paul E. McKenney wrote:
> > My concern is not the tick -- it is really easy to work around lack of a
> > tick from an RCU viewpoint. In fact, this happens automatically given the
> > current implementations! If there is a callback anywhere in the system,
> > then RCU will prevent the corresponding CPU from entering dyntick-idle
> > mode, and that CPU's clock will drive the rest of RCU as needed via
> > force_quiescent_state().
>
> Now, I'm confused, I thought a CPU entering idle nohz had nothing to do
> if it has no local callbacks, and rcu_enter_nohz already deals with
> everything.
>
> There are certainly tons of subtle things in RCU anyway :)

Well, I wasn't being all that clear above, apologies!!!

If a given CPU hasn't responded to the current RCU grace period,
perhaps due to being in a longer-than-average irq handler, then it
doesn't necessarily need its own scheduler tick enabled. If there is a
callback anywhere else in the system, then there is some other CPU with
its scheduler tick enabled. That other CPU can drive the slow-to-respond
CPU through the grace-period process.
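
To make that a bit more concrete, here is a toy user-space model in C.
It is only a sketch of the reasoning above, not kernel code: struct
toy_cpu and the toy_*() helpers are made-up names, and the real per-CPU
RCU state looks nothing like this. It assumes that only CPUs with
queued callbacks need their tick, and that any such CPU can act as the
"driver" for the CPUs that have not yet responded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy per-CPU state; hypothetical, not a kernel structure. */
struct toy_cpu {
	bool has_callbacks;	/* RCU callbacks queued on this CPU */
	bool qs_reported;	/* responded to the current grace period */
};

/* In this model a CPU keeps its scheduler tick only if it has callbacks. */
bool toy_needs_tick(const struct toy_cpu *c)
{
	return c->has_callbacks;
}

/*
 * Return the index of the first CPU whose tick can drive the grace
 * period, or -1 if no CPU has callbacks (nothing to drive).
 */
int toy_find_driver(const struct toy_cpu *cpus, size_t n)
{
	size_t i;

	assert(cpus != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (toy_needs_tick(&cpus[i]))
			return (int)i;
	return -1;
}

/* Count the CPUs that the driver still has to push through the grace period. */
size_t toy_count_holdouts(const struct toy_cpu *cpus, size_t n)
{
	size_t i, holdouts = 0;

	assert(cpus != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (!cpus[i].qs_reported)
			holdouts++;
	return holdouts;
}
```

The point of the model is that the slow-to-respond CPU shows up as a
holdout, yet toy_needs_tick() is false for it as long as some other CPU
has the callback.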

The current RCU code should work in the common case. There are probably
a few bugs, but I will make you a deal. You find them, I will fix them.
Particularly if you are willing to test the fixes.

> > The force_quiescent_state() workings would
> > want to be slightly different for dyntick-hpc, but not significantly so
> > (especially once I get TREE_RCU moved to kthreads).
> >
> > My concern is rather all the implicit RCU-sched read-side critical
> > sections, particularly those that arch-specific code is creating.
> > And it recently occurred to me that there are necessarily more implicit
> > irq/preempt disables than there are exception entries.
>
> Doh! You're right, I don't know why I thought that adaptive tick would
> solve the implicit rcu sched/bh cases, my vision took a shortcut.

Yeah, and I was clearly suffering from a bit of sleep deprivation when
we discussed this in Boston. :-/

> > So would you be OK with telling RCU about kernel entries/exits, but
> > simply not enabling the tick?
>
> Let's try that.

Cool!!!

> > The irq and NMI kernel entries/exits are
> > already covered, of course.
>
> Yep.
>
> > This seems to me to work out as follows:
> >
> > 1.  If there are no RCU callbacks anywhere in the system, RCU
> >     is quiescent and does not cause any IPIs or interrupts of
> >     any kind. For HPC workloads, this should be the common case.
>
> Right.
>
> > 2.  If there is an RCU callback, then one CPU keeps a tick going
> >     and drives RCU core processing on all CPUs. (This probably
> >     works with RCU as is, but somewhat painfully.) This results
> >     in some IPIs, but only to those CPUs that remain running in
> >     the kernel for extended time periods. Appropriate adjustment
> >     of RCU_JIFFIES_TILL_FORCE_QS, possibly promoted to be a
> >     kernel configuration parameter, should make such IPIs
> >     -extremely- rare. After all, how many kernel code paths
> >     are going to consume (say) 10 jiffies of CPU time? (Keep
> >     in mind that if the system call blocks, the CPU will enter
> >     dyntick-idle mode, and RCU will still recognize it as an
> >     innocent bystander without needing to IPI it.)
>
> That all makes sense. Also there may be periods when these "isolated"
> CPUs will restart the tick, like when there is more than one task
> running on that CPU, in which case we can of course fall back to the
> usual grace-period processing.

Yep!

> > 3.  The implicit RCU-sched read-side critical sections just work
> >     as they do today.
> >
> > Or am I missing some other problems with this approach?
>
> No, looks good, now I'm going to implement/test a draft of these ideas.
>
> Thanks a lot!

Very cool, and thank you!!! I am sure that you will not be shy about
letting me know of any RCU problems that you might encounter. ;-)
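
For reference, cases 1 and 2 above can be sketched as a toy user-space
scan in C. This is an illustration under stated assumptions, not the
actual force_quiescent_state() code: every toy_* name is made up, the
threshold of 10 jiffies is just the "(say) 10 jiffies" figure from the
discussion, and "extended quiescent state" here stands for a CPU that
RCU has been told is outside the kernel (dyntick-idle, or user mode
under the proposed entry/exit hooks).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for RCU_JIFFIES_TILL_FORCE_QS. */
#define TOY_JIFFIES_TILL_FORCE_QS 10UL

enum toy_ctx {
	TOY_IN_KERNEL,		/* running kernel code */
	TOY_EXTENDED_QS		/* dyntick-idle or user mode: innocent bystander */
};

/* Toy per-CPU state; hypothetical, not a kernel structure. */
struct toy_cpu_state {
	enum toy_ctx ctx;
	unsigned long kernel_jiffies;	/* time in the kernel since entry */
	bool has_callbacks;		/* RCU callbacks queued here */
	bool ipi_sent;			/* set by the scan below */
};

/*
 * One scan by the CPU that kept its tick.  Returns the number of IPIs
 * this model would send.
 */
size_t toy_force_qs_scan(struct toy_cpu_state *cpus, size_t n)
{
	bool any_callbacks = false;
	size_t i, ipis = 0;

	assert(cpus != NULL || n == 0);
	for (i = 0; i < n; i++) {
		cpus[i].ipi_sent = false;
		if (cpus[i].has_callbacks)
			any_callbacks = true;
	}

	/* Case 1: no callbacks anywhere, so RCU stays quiet. */
	if (!any_callbacks)
		return 0;

	/* Case 2: callbacks exist, so check the other CPUs. */
	for (i = 0; i < n; i++) {
		/* Extended quiescent state: counted without an IPI. */
		if (cpus[i].ctx == TOY_EXTENDED_QS)
			continue;
		/* Model assumption: a CPU with callbacks keeps its own tick. */
		if (cpus[i].has_callbacks)
			continue;
		/* Only long-running kernel code paths get an IPI. */
		if (cpus[i].kernel_jiffies >= TOY_JIFFIES_TILL_FORCE_QS) {
			cpus[i].ipi_sent = true;
			ipis++;
		}
	}
	return ipis;
}
```

In this model, raising the threshold is what makes the IPIs rare: a CPU
that leaves the kernel before the threshold never gets one.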

Thanx, Paul