Re: [PATCH tip/core/rcu 0/5] Fix for cond_resched performance regression

From: josh
Date: Fri Jun 20 2014 - 19:52:30 EST


On Fri, Jun 20, 2014 at 04:30:33PM -0700, Paul E. McKenney wrote:
> On Fri, Jun 20, 2014 at 03:39:51PM -0700, josh@xxxxxxxxxxxxxxxx wrote:
> > On Fri, Jun 20, 2014 at 03:11:20PM -0700, Paul E. McKenney wrote:
> > > On Fri, Jun 20, 2014 at 02:24:23PM -0700, josh@xxxxxxxxxxxxxxxx wrote:
> > > > On Fri, Jun 20, 2014 at 12:12:36PM -0700, Paul E. McKenney wrote:
> > > > > o Make cond_resched() a no-op for PREEMPT=y. This might well turn
> > > > > out to be a good thing, but it doesn't help give RCU the quiescent
> > > > > states that it needs.
> > > >
> > > > What about doing this, together with letting the fqs logic poke
> > > > un-quiesced kernel code as needed? That way, rather than having
> > > > cond_resched do any work, you have the fqs logic recognize that a
> > > > particular CPU has gone too long without quiescing, without disturbing
> > > > that CPU at all if it hasn't gone too long.
> > >
> > > My next stop is to post the previous series, but with a couple of
> > > exports and one bug fix uncovered by testing thus far, but after
> > > another round of testing. Then I am going to take a close look at
> > > this one:
> > >
> > > o Push the checks further into cond_resched(), so that the
> > > fastpath does the same sequence of instructions that the original
> > > did. This might work well, but requires IPIs, which are not so
> > > good for latencies on the remote CPU. It nevertheless might be a
> > > decent long-term solution given that if your CPU is spending many
> > > jiffies looping in the kernel, you aren't getting good latencies
> > > anyway. It also has the benefit of allowing RCU to take advantage
> > > of the implicit quiescent states of all cond_resched() calls,
> > > and of eliminating the need for a separate cond_resched_rcu_qs()
> > > and for RCU_COND_RESCHED_QS.
> > >
> > > The one you call out is of course interesting as well. But there are
> > > a couple of questions:
> > >
> > > 1. Why wasn't cond_resched() a no-op in CONFIG_PREEMPT to start
> > > > with? It just seems too obvious a thing to do for it to possibly
> > > be an oversight. (What, me paranoid?)
> > >
> > > 2. When RCU recognizes that a particular CPU has gone too long,
> > > exactly what are you suggesting that RCU do about it? When
> > > formulating your answer, please give due consideration to the
> > > implications of that CPU being a NO_HZ_FULL CPU. ;-)
> >
> > Send it an IPI that either causes it to flag a quiescent state
> > immediately if currently quiesced or causes it to quiesce at the next
> > opportunity if not.
>
> OK. But if we are in a !PREEMPT kernel,

That's not the case I was suggesting. *If* the kernel is fully
preemptible, then it makes little sense to put any code in cond_resched,
when instead another thread can simply cause a preemption if it needs a
quiescent state. That has the advantage of not imposing any unnecessary
polling on code running in the kernel.

In a !PREEMPT kernel, it makes a bit more sense to have cond_resched as
a voluntary preemption point. But voluntary preemption points don't
make as much sense in a kernel prepared to preempt a thread anywhere.
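
To make the contrast concrete, here is a rough userspace sketch, not
kernel code. Every name in it (model_cpu, model_cond_resched_voluntary,
model_force_preempt, and so on) is hypothetical and made up for
illustration; the "forced preemption" is only a stand-in for whatever
the fqs logic would actually do to the remote CPU. It just counts how
much polling the running loop does in each model:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: every name here is hypothetical, not a kernel API. */
struct model_cpu {
	bool qs_requested;       /* set once a quiescent state is wanted */
	unsigned long polls;     /* checks performed by the running code itself */
	unsigned long qs_count;  /* quiescent states reported so far */
};

/* !PREEMPT model: the loop polls at every voluntary preemption point. */
static void model_cond_resched_voluntary(struct model_cpu *cpu)
{
	assert(cpu != NULL);
	cpu->polls++;
	if (cpu->qs_requested) {
		cpu->qs_requested = false;
		cpu->qs_count++;
	}
}

/* PREEMPT model: cond_resched() is a no-op, so the loop never polls. */
static void model_cond_resched_preempt(struct model_cpu *cpu)
{
	(void)cpu;
}

/*
 * PREEMPT model: stand-in for another thread forcing a preemption.
 * The quiescent state is recorded without the loop having checked anything.
 */
static void model_force_preempt(struct model_cpu *cpu)
{
	assert(cpu != NULL);
	if (cpu->qs_requested) {
		cpu->qs_requested = false;
		cpu->qs_count++;
	}
}

/* Run a long in-kernel loop; a quiescent state is requested at request_at. */
static void model_run_loop(struct model_cpu *cpu, unsigned long iters,
			   unsigned long request_at, bool preemptible)
{
	for (unsigned long i = 0; i < iters; i++) {
		if (i == request_at) {
			cpu->qs_requested = true;
			if (preemptible)
				model_force_preempt(cpu);
		}
		if (preemptible)
			model_cond_resched_preempt(cpu);
		else
			model_cond_resched_voluntary(cpu);
	}
}
```

In this toy model both variants end up reporting the one requested
quiescent state, but only the voluntary variant pays for a check on
every iteration; the preemptible variant leaves the loop undisturbed
until someone actually needs it to quiesce.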

- Josh Triplett