Re: [PATCH 0/2] [GIT PULL] tracing: various bug fixes

From: Frederic Weisbecker
Date: Wed Apr 22 2009 - 13:11:13 EST


On Wed, Apr 22, 2009 at 09:49:14AM -0400, Steven Rostedt wrote:
>
> On Wed, 22 Apr 2009, Frederic Weisbecker wrote:
> > >
> > > I spent the entire day (and half the night) debugging this. I was fighting
> > > a case where the hardirqs_enabled flag in the task struct (lockdep flag)
> > > was mysteriously being set and cleared. I stepped through the entire
> > > kernel thread fork process (that was an exercise) and could not find
> > > anything wrong.
> > >
> > > Sometimes it would go away with printks, sometimes it would not. This was
> > > driving me crazy, until I noticed that paravirt was enabled.
> > >
> > > Turning off paravirtualization here (so far) makes everything run
> > > smoothly.
> > >
> > > Thus my theory is that there's something fishy in the way the irq
> > > enable/disable code gets modified when the system detects that it is
> > > running on bare hardware.
> > >
> > > I'm too tired to look at this more. Ingo supplied a config to play with.
> > > You can disable VSMP too and it will still trigger the crash.
> > >
> > > -- Steve
> > >
> >
> > It's indeed a tricky one. I can reproduce it too. I will try to
> > capture an irqsoff trace at this point; hopefully that will show
> > the source of this irq disabling...
>
> It doesn't disable interrupts :-/
>
> It is the hardirqs_enabled flag in the task struct that mysteriously turns
> off and back on. I put in printks when it is off in fork, and the next
> printk shows that it turns back on (between the printks!!!).
>
> I printed the output of "irqs_disabled()" on each of these printks and
> interrupts are always enabled. It is only the hardirqs_enabled flag that
> is giving strange outputs.


Oh, weird...


> Do you have CONFIG_PARAVIRT on? Since I disabled it, I have yet to
> reproduce the bug. But I've only rebooted a few times. I'm going to
> continue to reboot to see if I can trigger it.


Yes, it is enabled.



> I'm thinking that the paravirt alternative code may have clobbered a
> register in either the enabling or disabling of interrupts. This might cause
> a strange value to go into the hardirqs_enabled flag.



OK, I will try it without PARAVIRT and tell you if I can still reproduce it.



> Thanks,
>
> -- Steve
>
