Re: linux-next 20111025: warnings inrcu_idle_exit_common()/rcu_idle_enter_common()

From: Paul E. McKenney
Date: Mon Oct 31 2011 - 20:35:24 EST


On Mon, Oct 31, 2011 at 11:44:42AM -0400, Steven Rostedt wrote:
> On Mon, 2011-10-31 at 05:19 -0700, Paul E. McKenney wrote:
> > On Mon, Oct 31, 2011 at 07:41:42PM +0800, Wu Fengguang wrote:
> > > On Mon, Oct 31, 2011 at 06:43:25PM +0800, Wu Fengguang wrote:
> > > > On Mon, Oct 31, 2011 at 05:51:52PM +0800, Paul E. McKenney wrote:
> > > > > On Mon, Oct 31, 2011 at 04:26:34PM +0800, Wu Fengguang wrote:
> > > > > > Hi Paul,
> > > > > >
> > > > > > I got two warnings in rcutree.c. The last working kernels are
> > > > > > linux-next 20111014 and linux v3.1.
> > > > >
> > > > > Interesting. Could you please enable RCU event tracing at boot?
> > > >
> > > > Sorry I cannot...possibly due to another ftrace bug.
> > > >
> > > > > The RCU event tracing is at tracing/events/rcu/enable relative to
> > > > > the debugfs mount point at runtime, if that helps.
> > > >
> > > > It's exactly that linux next 20111025 (comparing to 20111014) no
> > > > longer produces all the trace events that made me looking into the
> > > > dmesg and find the warning from RCU (rather than the expected warning
> > > > from ftrace).
> > > >
> > > > The trace output is now:
> > > >
> > > > # tracer: nop
> > > > #
> > > > # WARNING: FUNCTION TRACING IS CORRUPTED
> > > > # MAY BE MISSING FUNCTION EVENTS
> > > > # TASK-PID CPU# TIMESTAMP FUNCTION
> > > > # | | | | |
> > > > (nothing more)
> > >
> > > I checked the other test box and got the same warnings. Below is the
> > > full dmesg.
> > >
> > > No single trace output again..
> >
> > Hmmm... I wonder if it is too early during boot for tracing to work
> > correctly.
> >
> > Gah! I have rcu/next set ahead to commits that are not supposed to go
> > upstream yet. I reset it back to match the stuff that is targeted for
> > the current merge window. Still need to find the bug, of course.
> >
> > Anyone have any idea why the kworker thread might be trying to enter
> > the idle loop? The idle_cpu(smp_processor_id()) call believes that
> > this is not the idle task. Or does x86 allow non-idle tasks to enter
> > the idle loop? Or to be migrated off-CPU?
>
>
> It's not. Carsten Emde noticed what looked like a bug in ftrace last
> week at LinuxCon, and looking deeper at it, I found that the swapper
> task for all but CPU0 is named kworker. That's because kworker creates
> the idle task for all other CPUs besides CPU 0 and the idle task takes
> on kworker name.
>
> Carsten posted a patch last week too:
>
> https://lkml.org/lkml/2011/10/26/313
>
> I'm glad that this bug shows up outside of just ftrace :)

That makes one of us. ;-)

Fengguang, does Carsten's patch help?

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/