Re: strace lockup when tracing exec in go

From: Aleksa Sarai
Date: Thu Sep 22 2016 - 04:21:07 EST


So I've stared into do_notify_parent some more and the following was
just very confusing

if (!tsk->ptrace && sig == SIGCHLD &&
(psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN ||
(psig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDWAIT))) {
/*
* We are exiting and our parent doesn't care. POSIX.1
* defines special semantics for setting SIGCHLD to SIG_IGN
* or setting the SA_NOCLDWAIT flag: we should be reaped
* automatically and not left for our parent's wait4 call.
* Rather than having the parent do it as a magic kind of
* signal handler, we just set this to tell do_exit that we
* can be cleaned up without becoming a zombie. Note that
* we still call __wake_up_parent in this case, because a
* blocked sys_wait4 might now return -ECHILD.
*
* Whether we send SIGCHLD or not for SA_NOCLDWAIT
* is implementation-defined: we do (if you don't want
* it, just use SIG_IGN instead).
*/
autoreap = true;
if (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN)
sig = 0;
}

it tries to prevent from what I am seeing in a way. If the SIGCHLD is
ignored then it just does autoreap and everything is fine. But this
doesn't seem to be the case here. In fact we are not sending the signal
because sig_task_ignored is true resp. sig_handler_ignored which can
fail even for handler == SIG_DFL && sig_kernel_ignore() and SIGCHLD
seems to be in SIG_KERNEL_IGNORE_MASK. So I've tried

I was looking at the same code this morning. I thought maybe we should drop the !tsk->ptrace condition (or make it so that the condition still succeeds if the tracer also happens to be tsk->real_parent) -- since this is only happening when the process is being traced? I tried this and the issue still persists, but I didn't apply your other proposed change to this conditional.

Or am I misunderstanding what tsk->ptrace refers to?

--
Aleksa Sarai
Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/