Re: 2.6.9-rc2-mm4 ps hang ?

From: Badari Pulavarty
Date: Mon Oct 04 2004 - 10:58:53 EST


On Fri, 2004-10-01 at 17:44, Andrew Morton wrote:
> Badari Pulavarty <pbadari@xxxxxxxxxx> wrote:
> >
> >
> > On Fri, 2004-10-01 at 16:49, Andrew Morton wrote:
> > > Badari Pulavarty <pbadari@xxxxxxxxxx> wrote:
> > > >
> > > > Here is the full sysrq-t output.
> > >
> > > What's this guy up to?
> > ...
> > >
> > > Something is seriously screwed up if it's stuck in try_to_wake_up(). Tried
> > > generating a few extra traces?
> > >
> > > Then again, maybe we're missing an up_read() somewhere. hrm, I'll check.
> > >
> >
> > It reproduced again. I think this is the one causing all the troubles..
> >
> > db2fmcd D ffffffff80132e2a 0 10854 7636 (NOTLB)
> > 00000101bae85ef8 0000000000000002 0000020800000018 00000101d9ddd550
> > 0000000100000084 000001016d490e20 000001016d491158 00000101d9ddd550
> > 0000000000000206 ffffffff801353cb
> > Call Trace:<ffffffff801353cb>{try_to_wake_up+971}
> > <ffffffff804455f0>{__down_write+128}
> > <ffffffff80125e6f>{sys32_mmap+143}
> > <ffffffff80124af1>{ia32_sysret+0}
> >
> > when I tried looking at /proc/10854 - it hung.
>
> OK, so maybe that CPU is spinning in try_to_wake_up(). Can you tell if one
> CPU is busy when this happens?

No CPU is busy when this happens. The process is in "D" state.
I traced it a few times when this happens, and it is always stuck at the same place.
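
For anyone who wants to confirm the "D" state from user space rather than from sysrq-t, here is a minimal sketch in C. It only parses a line in the /proc/<pid>/stat layout ("pid (comm) state ppid ..."); the function names are hypothetical and not from any kernel or library source. Note that reading the /proc entries of the stuck task may itself block, as happened with /proc/10854 here, so the sketch takes the line as a string and leaves the read to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the one-character task state from a /proc/<pid>/stat line,
 * or '\0' if the line is malformed.
 *
 * The layout is "pid (comm) state ppid ...". comm may itself contain
 * spaces or parentheses, so look for the last ')' instead of
 * splitting on whitespace.
 */
char task_state_from_stat(const char *stat_line)
{
    const char *close;

    assert(stat_line != NULL);

    close = strrchr(stat_line, ')');
    if (close == NULL || close[1] != ' ' || close[2] == '\0')
        return '\0';

    return close[2];
}

/* Non-zero if the task is in uninterruptible sleep ("D"). */
int task_is_uninterruptible(const char *stat_line)
{
    return task_state_from_stat(stat_line) == 'D';
}
```

Passing a made-up line such as "10854 (db2fmcd) D 7636" to task_is_uninterruptible() returns non-zero. This only reports the state; it says nothing about which lock the task is waiting on.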


>
> Or you could try Peter's suggestion:
>
> --- 25/kernel/sched.c~a 2004-10-01 17:43:32.500700488 -0700
> +++ 25-akpm/kernel/sched.c 2004-10-01 17:43:34.754357880 -0700
> @@ -1606,7 +1606,7 @@ out_set_cpu:
> task_rq_unlock(rql, &flags);
> /* might preempt at this point */
> rql = task_rq_lock(p, &flags);
> - adjust_sched_timestamp(p, old_rq);
> +// adjust_sched_timestamp(p, old_rq);
> old_state = p->state;
> if (!(old_state & state))
> goto out;

I will try this patch. The only problem is that the hang happens very randomly.

Thanks,
Badari
