Re: Problems with 2.6.17-rt8

From: Steven Rostedt
Date: Thu Aug 03 2006 - 16:52:11 EST


On Thu, 2006-08-03 at 13:22 -0700, Bill Huey wrote:
> On Thu, Aug 03, 2006 at 10:27:41AM -0400, Steven Rostedt wrote:
> > The rest was probably caused as a side effect from above. The above is
> > already broken!
> >
> > You have NUMA configured too, so this is also something to look at.
> >
> > I still wouldn't ignore the first bug message you got:
> >
> > ----
> > BUG: scheduling while atomic: udev_run_devd/0x00000001/1568
> >
> > Call Trace:
> > <ffffffff8045c693>{__schedule+155}
> > <ffffffff8045f156>{_raw_spin_unlock_irqrestore+53}
> > <ffffffff80242241>{task_blocks_on_rt_mutex+518}
> > <ffffffff80252da0>{free_pages_bulk+39}
> > <ffffffff80252da0>{free_pages_bulk+39}
> > ...
> > ----
> >
> > This could also have a side effect that messes things up.
> >
> > Unfortunately, right now I'm assigned to other tasks and I can't spend
> > much more time on this at the moment. So hopefully, Ingo, Thomas or
> > Bill, or someone else can help you find the reason for this problem.
>
> free_pages_bulk is definitely being called inside of an atomic.
> I force this stack trace when the in_atomic() test is true at the
> beginning of the function.
>
>
> [ 29.362863] Call Trace:
> [ 29.367107] <ffffffff802a82ac>{free_pages_bulk+86}
> [ 29.373122] <ffffffff80261726>{_raw_spin_unlock_irqrestore+44}
> [ 29.380233] <ffffffff802a8778>{__free_pages_ok+428}
> [ 29.386336] <ffffffff8024f101>{free_hot_page+25}
> [ 29.392165] <ffffffff8022e298>{__free_pages+41}
> [ 29.397898] <ffffffff806b604d>{__free_pages_bootmem+174}
> [ 29.404457] <ffffffff806b5266>{free_all_bootmem_core+253}
> [ 29.411112] <ffffffff806b5340>{free_all_bootmem_node+9}
> [ 29.417574] <ffffffff806b254e>{numa_free_all_bootmem+61}
> [ 29.424122] <ffffffff8046e96e>{_etext+0}
> [ 29.429224] <ffffffff806b1392>{mem_init+128}
> [ 29.434691] <ffffffff806a17ab>{start_kernel+377}
> [ 29.440520] <ffffffff806a129b>{_sinittext+667}
> [ 29.446669] ---------------------------
> [ 29.450963] | preempt count: 00000001 ]
> [ 29.455257] | 1-level deep critical section nesting:
> [ 29.460732] ----------------------------------------
> [ 29.466212] .. [<ffffffff806a169a>] .... start_kernel+0x68/0x221
> [ 29.472815] .....[<ffffffff806a129b>] .. ( <= _sinittext+0x29b/0x2a2)
> [ 29.480056]

Perhaps you could put that in_atomic() check at the start of each of
these functions and point to where the problem first shows up. Perhaps a
spinlock is being taken that is a real spinlock and not a mutex.
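
To illustrate the idea, here is a rough userspace sketch, not kernel
code. Everything in it is a stand-in made up for the example:
mock_preempt_count and mock_in_atomic() imitate the kernel's preempt
count and in_atomic(), and top(), middle() and leaf() are a hypothetical
call chain. The same check goes at the top of every function, and the
first function that reports atomic context tells you that its caller is
the one that entered the critical section.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-in for the kernel's preempt count (assumption). */
static int mock_preempt_count;

/* Mock of in_atomic(): a nonzero count means atomic context. */
#define mock_in_atomic() (mock_preempt_count != 0)

/* First function that was entered in atomic context, or NULL. */
static const char *first_atomic_fn;

/* Place at the top of every function in the suspect call chain. */
#define CHECK_ATOMIC()                                             \
	do {                                                       \
		if (mock_in_atomic() && !first_atomic_fn) {        \
			first_atomic_fn = __func__;                \
			fprintf(stderr, "atomic on entry to %s\n", \
				__func__);                         \
		}                                                  \
	} while (0)

/* Hypothetical innermost function; it is entered with the count raised. */
static void leaf(void)
{
	CHECK_ATOMIC();
}

/* Hypothetical function that takes a "real" spinlock around its callee. */
static void middle(void)
{
	CHECK_ATOMIC();
	mock_preempt_count++;	/* models taking a real spinlock */
	leaf();
	mock_preempt_count--;	/* models releasing it */
	assert(mock_preempt_count >= 0);
}

/* Hypothetical entry point of the call chain. */
static void top(void)
{
	CHECK_ATOMIC();
	middle();
}
```

Calling top() here reports leaf() as the first function entered in
atomic context, which points at middle() as the place where the count
was raised. In the kernel the check would use the real in_atomic() and
something like dump_stack() instead of fprintf().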

-- Steve

