Re: 2.6.24 regression: pan hanging unkilleable and un-straceable

From: Nick Piggin
Date: Tue Jan 22 2008 - 00:26:33 EST


On Tuesday 22 January 2008 16:03, Mike Galbraith wrote:
> On Tue, 2008-01-22 at 11:05 +1100, Nick Piggin wrote:
> > On Tuesday 22 January 2008 07:58, Frederik Himpe wrote:
> > > With Linux 2.6.24-rc8 I often have the problem that the pan usenet
> > > reader starts using 100% of CPU time after some time. When this
> > > happens, kill -9 does not work, and strace just hangs when trying to
> > > attach to the process. The same with gdb. ïps shows the process as
> > > being in the R state.
> > >
> > > I pressed Ctrl-Alt-SysRq-T, and this was shown for pan:
> > > Jan 21 21:45:01 Anastacia kernel: pan R running task
> > > 0
> >
> > Well I've twice tried to submit a patch to print stacks for running
> > tasks as well, but nobody seems interested. It would at least give a
> > chance to see something.
>
> I've hit same twice recently (not pan, and not repeatable).

Nasty. The attached patch is something really simple that can sometimes help.
sysrq+p is also an option, if you're on a UP system.

Any luck getting traces?

Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -4920,8 +4920,7 @@ static void show_task(struct task_struct
printk(KERN_CONT "%5lu %5d %6d\n", free,
task_pid_nr(p), task_pid_nr(p->real_parent));

- if (state != TASK_RUNNING)
- show_stack(p, NULL);
+ show_stack(p, NULL);
}

void show_state_filter(unsigned long state_filter)