Re: 2.6.9-rc2-mm4 ps hang ?

From: Andrew Morton
Date: Fri Oct 01 2004 - 19:48:37 EST


Badari Pulavarty <pbadari@xxxxxxxxxx> wrote:
>
>
> On Fri, 2004-10-01 at 16:49, Andrew Morton wrote:
> > Badari Pulavarty <pbadari@xxxxxxxxxx> wrote:
> > >
> > > Here is the full sysrq-t output.
> >
> > What's this guy up to?
> ...
> >
> > Something is seriously screwed up if it's stuck in try_to_wake_up(). Tried
> > generating a few extra traces?
> >
> > Then again, maybe we're missing an up_read() somewhere. hrm, I'll check.
> >
>
> It reproduced again. I think this is the one causing all the troubles..
>
> db2fmcd D ffffffff80132e2a 0 10854 7636
> (NOTLB)
> 00000101bae85ef8 0000000000000002 0000020800000018 00000101d9ddd550
> 0000000100000084 000001016d490e20 000001016d491158
> 00000101d9ddd550
> 0000000000000206 ffffffff801353cb
> Call Trace:<ffffffff801353cb>{try_to_wake_up+971}
> <ffffffff804455f0>{__down_write+128}
> <ffffffff80125e6f>{sys32_mmap+143}
> <ffffffff80124af1>{ia32_sysret+0}
>
> when I tried looking at /proc/10854 - it hung.

OK, so maybe that CPU is spinning in try_to_wake_up(). Can you tell if one
CPU is busy when this happens?

Or you could try Peter's suggestion:

--- 25/kernel/sched.c~a 2004-10-01 17:43:32.500700488 -0700
+++ 25-akpm/kernel/sched.c 2004-10-01 17:43:34.754357880 -0700
@@ -1606,7 +1606,7 @@ out_set_cpu:
task_rq_unlock(rql, &flags);
/* might preempt at this point */
rql = task_rq_lock(p, &flags);
- adjust_sched_timestamp(p, old_rq);
+// adjust_sched_timestamp(p, old_rq);
old_state = p->state;
if (!(old_state & state))
goto out;
_

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/