Re: process 'stuck' at exit.

From: Oleg Nesterov
Date: Sat Dec 14 2013 - 15:17:29 EST


On 12/10, Dave Jones wrote:
>
> On Tue, Dec 10, 2013 at 07:23:30PM -0500, Dave Jones wrote:
>
> > I was distracted by seeing all the other threads exiting, so I was only looking at
> > what this one had already done.
>
> another thing that distracted me was that /proc/10818/stack was just showing that
>
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> output.
>
> For my own education, what causes that ?

save_stack_trace_tsk() adds ULONG_MAX as the "last" entry,

and dump_trace() fails if the task is running and != current (note that
cat /proc/self/stack works). So the terminator is the only entry you see.
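
To illustrate, here is a minimal userspace sketch of that sequence. It is
not kernel code: fake_dump_trace(), fake_save_stack_trace_tsk(),
print_trace(), the simplified struct stack_trace and the sample addresses
are all assumptions made up for this example. Only the "append ULONG_MAX
after the walk" step follows what is described above.

```c
#include <limits.h>
#include <stdio.h>

#define MAX_ENTRIES 8

/* Simplified userspace stand-in for the kernel's struct stack_trace. */
struct stack_trace {
	unsigned int nr_entries;
	unsigned int max_entries;
	unsigned long *entries;
};

/*
 * Hypothetical stand-in for dump_trace(): it records nothing when the
 * target is running and is not the caller itself.
 */
void fake_dump_trace(struct stack_trace *trace, int running, int is_current)
{
	/* Made-up return addresses, for illustration only. */
	static const unsigned long sample[] = { 0x1000UL, 0x2000UL };
	unsigned int i;

	if (running && !is_current)
		return;

	for (i = 0; i < 2 && trace->nr_entries < trace->max_entries; i++)
		trace->entries[trace->nr_entries++] = sample[i];
}

/* Walk the stack, then append ULONG_MAX as the "last" entry. */
void fake_save_stack_trace_tsk(struct stack_trace *trace, int running,
			       int is_current)
{
	fake_dump_trace(trace, running, is_current);
	if (trace->nr_entries < trace->max_entries)
		trace->entries[trace->nr_entries++] = ULONG_MAX;
}

/* Print one line per entry, roughly in the /proc/<pid>/stack layout. */
void print_trace(const struct stack_trace *trace)
{
	unsigned int i;

	for (i = 0; i < trace->nr_entries; i++)
		printf("[<%016lx>] 0x%lx\n",
		       trace->entries[i], trace->entries[i]);
}
```

Calling fake_save_stack_trace_tsk() with running = 1 and is_current = 0
leaves exactly one entry, ULONG_MAX, and on an LP64 system print_trace()
then shows it as ffffffffffffffff, as in the report.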

> How come it didn't show the same trace I saw when I sysrq-t'd ?

Because print_trace_address() does not skip !reliable entries,
unlike __save_stack_address(). This (afaics) makes the difference.
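
The difference can be sketched in userspace as two collectors over the same
list of candidate frames. This is only an illustration: struct frame,
collect_reliable_only() and collect_all() are hypothetical names, not the
kernel's callbacks.

```c
#include <stddef.h>

/* A candidate return address plus the walker's verdict on it. */
struct frame {
	unsigned long addr;
	int reliable;
};

/*
 * Behaves like the /proc/<pid>/stack path as described above:
 * entries that are not marked reliable are skipped.
 */
size_t collect_reliable_only(const struct frame *frames, size_t n,
			     unsigned long *out, size_t max)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		if (!frames[i].reliable)
			continue;
		if (count < max)
			out[count++] = frames[i].addr;
	}
	return count;
}

/*
 * Behaves like the sysrq-t path as described above: every entry is
 * kept, reliable or not.
 */
size_t collect_all(const struct frame *frames, size_t n,
		   unsigned long *out, size_t max)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		if (count < max)
			out[count++] = frames[i].addr;
	}
	return count;
}
```

If the walker starts from a bad bp, nothing gets marked reliable, so the
first collector returns nothing while the second still returns everything
it found on the stack.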

I'll try to make a patch but I am not sure... I am not even sure
it makes sense, but in any case this all doesn't look right to me.

First of all, stack = task->thread.sp is not really right if this
task is running. Worse, bp = *stack returned by stack_frame() is
random in this case. This equally applies to sysrq-t's output.
Not that bad, but still wrong and confusing, imho.

And let's look at dump_trace(),

	const unsigned cpu = get_cpu();
	unsigned long *irq_stack_end =
		(unsigned long *)per_cpu(irq_stack_ptr, cpu);

This (in general) has nothing to do with task_cpu(task).
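
A small userspace sketch of the mismatch, with a plain array standing in
for the per-cpu variable. Everything here (fake_irq_stack_ptr, struct
fake_task, both helpers, the addresses) is made up for illustration and is
not the kernel's API.

```c
#define FAKE_NR_CPUS 4

/* Stand-in for a per-cpu irq_stack_ptr: one made-up value per CPU. */
unsigned long fake_irq_stack_ptr[FAKE_NR_CPUS] = {
	0xa000UL, 0xb000UL, 0xc000UL, 0xd000UL
};

/* Stand-in for a task that last ran on ->cpu. */
struct fake_task {
	int cpu;
};

/* What the quoted code computes: the IRQ stack of the caller's CPU. */
unsigned long irq_stack_end_seen_by_caller(int this_cpu)
{
	return fake_irq_stack_ptr[this_cpu];
}

/* What would match the task being dumped: the IRQ stack of its CPU. */
unsigned long irq_stack_end_of_task(const struct fake_task *task)
{
	return fake_irq_stack_ptr[task->cpu];
}
```

The two helpers agree only when the caller happens to run on the same CPU
as the task being dumped.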

And why does dump_trace() check irq_stack_end != NULL? This is always true.

I think that these paths should not even try to guess what bp is
if the task is not running/current. But it is not clear how to "disable"
the reliable check in __save_stack_address(); we should report this fact
in proc_pid_stack()->seq_printf() somehow.

And proc_pid_stack() should drop ->cred_guard_mutex right after
save_stack_trace_tsk(), although this is off-topic.

Oleg.
