Re: [PATCH] ia64 cpuset + build_sched_domains() mangles structures

From: Dinakar Guniguntala
Date: Mon Aug 22 2005 - 15:16:37 EST


On Tue, Aug 23, 2005 at 01:46:26AM +0530, Dinakar Guniguntala wrote:
> On Mon, Aug 22, 2005 at 06:07:19PM +0200, Ingo Molnar wrote:
> > great! Andrew, i'd suggest we try the merged patch attached below in
> > -mm.
> >
>
> Ingo, unfortunately I am hitting panic's on stress testing. The panic
> screen is attached in the .png below.

Sorry, forgot to add the .png. Here it is...

>
> On debugging I found that the panic happens consistently in this line
> of code in function find_busiest_group
>
> *imbalance = min((max_load - avg_load) * busiest->cpu_power,
> (avg_load - this_load) * this->cpu_power)
> / SCHED_LOAD_SCALE;
>
> Here I find that the "this" pointer is still NULL. I verified this by
> a quick hack as below in the same function and with this hack it seems
> to run for hours
>
> - if (!busiest || this_load >= max_load)
> + if (!this || !busiest || this_load >= max_load)
>
> This can only happen if the none of the sched groups pointed to by the
> 'sd' of the current cpu contain the current cpu. I was wondering if
> this had anything to do with the way that we are using RCU to assign/
> read the 'sd' pointer.
>
> Any thoughts ??
>
> -Dinakar
>

Attachment: sd-panic.png
Description: PNG image