Re: rcu_preempt stalls / lockup

From: Paul E. McKenney
Date: Tue Apr 01 2014 - 14:33:00 EST


On Tue, Apr 01, 2014 at 02:04:14PM -0400, Dave Jones wrote:
> On Tue, Apr 01, 2014 at 10:55:45AM -0700, Paul E. McKenney wrote:
> > > > > so kernel space still works like before, but userspace is locked up.
> > > >
> > > > Interesting. I suspect that if you reverted the rest of this merge
> > > > window's RCU patches, you would get the same result.
>
> Something that occurred to me is that this might be something in the x86 merge
> that's just changing timings enough to expose this problem.
> At some point this evening, I'll try bisecting it if we don't get any closer.

OK. ;-)

> > > [ 1953.672735] INFO: Stall ended before state dump start, gp_kthread state: 0x2
> > > [ 2148.608132] INFO: rcu_preempt detected stalls on CPUs/tasks:
> > > [ 2148.609140] (detected by 0, t=104027 jiffies, g=47728, c=47727, q=0)
> > > etc etc.
> >
> > Waiting uninterruptibly. Presumably blocked on mutex_lock(). But
> > you have CONFIG_PROVE_LOCKING=y, so any deadlocks should have been
> > reported.
>
> Lockdep had reported something a little earlier (timestamped at 1108.xxxxxx)
> but that's a known false-positive in xfs.

Yep, I would be very surprised if that was related to the grace-period hang.
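
For readers following along with the stall report quoted above, here is a
minimal Python sketch that pulls the numeric fields out of the
"(detected by ...)" line. The function names are hypothetical, and the
reading of the fields (g as the grace period most recently started, c as
the one most recently completed) is an assumption about the stall-warning
format of kernels from this period, not something stated in this thread.

```python
import re

# Matches the summary line that follows "INFO: rcu_preempt detected stalls".
STALL_RE = re.compile(
    r"\(detected by (\d+), t=(\d+) jiffies, g=(\d+), c=(\d+), q=(\d+)\)"
)

def parse_stall_line(line):
    """Return the stall-summary fields as a dict of ints, or None if no match."""
    m = STALL_RE.search(line)
    if m is None:
        return None
    cpu, t, g, c, q = (int(x) for x in m.groups())
    return {"detected_by": cpu, "t_jiffies": t, "g": g, "c": c, "q": q}

def grace_period_in_progress(fields):
    """Assumed reading: g ahead of c means a grace period has started
    but has not yet completed."""
    return fields["g"] != fields["c"]

line = "[ 2148.609140] (detected by 0, t=104027 jiffies, g=47728, c=47727, q=0)"
fields = parse_stall_line(line)
```

Under that reading, the quoted line shows grace period 47728 started but
not completed, which would fit a grace-period hang.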

> > Given that you have CONFIG_RCU_TRACE=y, could you please enable the
> > following trace events and dump the trace before things hang?
> >
> > trace_event=rcu:rcu_grace_period,rcu:rcu_grace_period_init
> >
> > If it is not feasible to dump the trace before things hang, let me
> > know, and I will work out some other diagnostic regime.
>
> I'll give that a shot when I get back in a few hours.

Cool!
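
If rebooting with the trace_event= boot parameter is inconvenient, the
same two events can usually be enabled at runtime through the tracing
filesystem. The following is a minimal Python sketch, not a tested
procedure: it assumes tracing is mounted at /sys/kernel/debug/tracing,
that the script runs as root, and the helper names are hypothetical.

```python
import os

# Assumption: debugfs is mounted and tracing lives at this path.
TRACEFS = "/sys/kernel/debug/tracing"

def event_enable_paths(trace_event_arg, tracefs=TRACEFS):
    """Map a 'system:event,system:event' string to per-event 'enable' files."""
    paths = []
    for item in trace_event_arg.split(","):
        system, event = item.strip().split(":", 1)
        paths.append(os.path.join(tracefs, "events", system, event, "enable"))
    return paths

def enable_events(trace_event_arg, tracefs=TRACEFS):
    """Write '1' to each event's 'enable' file (requires root)."""
    for path in event_enable_paths(trace_event_arg, tracefs):
        with open(path, "w") as f:
            f.write("1")

def dump_trace(out_path, tracefs=TRACEFS):
    """Copy the current trace buffer to a regular file before things hang."""
    with open(os.path.join(tracefs, "trace")) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(line)

EVENTS = "rcu:rcu_grace_period,rcu:rcu_grace_period_init"
paths = event_enable_paths(EVENTS)
```

Calling enable_events(EVENTS) early and dump_trace() periodically to a
file on disk would leave a copy of the trace available even if userspace
later locks up.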

Thanx, Paul
