On Wed, Apr 22, 2009 at 09:49:14AM -0400, Steven Rostedt wrote:
On Wed, 22 Apr 2009, Frederic Weisbecker wrote:
I spent the entire day (and half the night) debugging this. I was fighting a case where the hardirqs_enabled flag in the task struct (lockdep flag) was mysteriously being set and cleared. I stepped through the entire kernel thread fork process (that was an exercise) and could not find anything wrong.
Sometimes it would go away with printk's sometimes it would not. This was driving me crazy, until I noticed that paravirt was enabled.
Turning off paravirtualization here (so far) makes everything run smoothly.
Thus my theory is that there's something fishy with the modifying of the irq enable/disable code when the system detects that it is running on bare hardware.
I'm too tired to look at this more. Ingo supplied a config to play with. You can disable VSMP too and it will still trigger the crash.
-- Steve
It's indeed a tricky one. I can reproduce it too, I will try to manage having an irqsoff trace at this point, hopefully I could get the source of this irq disabling...
It doesn't disable interrupts :-/
It is the hardirqs_enabled flag in the task struct that mysteriously turns off and back on. I put in printks when it is off in fork, and the next printk shows that it turns back on (between the printks!!!).
I printed the output of "irqs_disabled()" on each of these printks and interrupts are always enabled. It is only the hardirqs_enabled flag that is giving strange outputs.
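The check described above (print the lockdep flag next to the real interrupt state at each checkpoint and look for disagreement) can be sketched roughly as follows. This is only a user-space model, not kernel code: `model_task`, `model_cpu`, `model_irqs_disabled()`, `model_state_mismatch()` and `model_report()` are hypothetical names invented for the illustration, and the real hardirqs_enabled flag and irqs_disabled() are only imitated here.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space model; none of these types exist in the kernel. */
struct model_task {
    int hardirqs_enabled;   /* stands in for the lockdep shadow flag */
};

struct model_cpu {
    bool irq_flag;          /* stands in for the real interrupt-enable state */
};

/* Imitates irqs_disabled(): reads the real state, never the shadow flag. */
static bool model_irqs_disabled(const struct model_cpu *cpu)
{
    return !cpu->irq_flag;
}

/* Returns true when the shadow flag disagrees with the real state. */
static bool model_state_mismatch(const struct model_task *t,
                                 const struct model_cpu *cpu)
{
    bool shadow_enabled = t->hardirqs_enabled != 0;
    bool real_enabled = !model_irqs_disabled(cpu);

    return shadow_enabled != real_enabled;
}

/* Prints both views at a named checkpoint; returns true on disagreement. */
static bool model_report(const char *where, const struct model_task *t,
                         const struct model_cpu *cpu)
{
    bool bad = model_state_mismatch(t, cpu);

    printf("%s: hardirqs_enabled=%d irqs_disabled()=%d%s\n",
           where, t->hardirqs_enabled, (int)model_irqs_disabled(cpu),
           bad ? " <-- mismatch" : "");
    return bad;
}
```

Calling `model_report()` at two consecutive checkpoints with interrupts enabled and the flag cleared in between would flag the first one only, which is the pattern reported here: the real state never changes, only the shadow flag does.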
Oh, weird...
Do you have CONFIG_PARAVIRT on? Since I disabled it, I have yet to reproduce the bug. But I've only rebooted a few times. I'm going to continue to reboot to see if I can trigger it.
Yes, it is enabled.
I'm thinking that the paravirt alternative code may have clobbered a register in either the enabling or the disabling of interrupts. This might cause a strange value to go into the hardirqs_enabled flag.
OK, I will try it without PARAVIRT and tell you if I can reproduce it.