Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH

From: Denys Vlasenko
Date: Mon Feb 14 2011 - 06:40:11 EST


On Monday 14 February 2011 10:03, Jan Kratochvil wrote:
> On Mon, 14 Feb 2011 00:01:47 +0100, Denys Vlasenko wrote:
> > * sleep runs in nanosleep
> > * SIGSTOP arrives, strace sees it
> > * strace logs it and allows it via ptrace(PTRACE_SYSCALL, ..., SIGSTOP)
> > * sleep process enters group-stop
>
> The last point breaks the documented behavior of ptrace:
> If data is nonzero and not SIGSTOP, it is interpreted as a signal to
> be delivered to the child; otherwise, no signal is delivered.

But SIGSTOP _is_ delivered - that's why the sleep process stops.

> I do not see it would affect gdb. strace will change its behavior when
> SIGSTOP is sent to its tracee although the new behavior may be OK.
>
> It is more a subject of apps compatibility testing with such a kernel change.
>
>
> > * nothing happens until some other signal arrives
> > * say, SIGCONT arrives
>
> What if other signal arrives? The tracer probably should not be notified as
> the tracee is in a group-stop.

Ideally, the behavior here should be the same as for a non-traced process:
signals are remembered while the process is stopped, and it sees them
only after SIGCONT, as demonstrated by the following program:

#include <errno.h>
#include <string.h>
#include <stdio.h>
#include <signal.h>
#include <unistd.h>

/* Report each signal as its handler runs, preserving errno for main(). */
static void sig(int n)
{
	char buf[128];
	int e = errno;

	snprintf(buf, sizeof(buf), "sig: %d %s\n", n, strsignal(n));
	write(1, buf, strlen(buf));
	errno = e;
}

int main(void)
{
	int e;

	signal(SIGSTOP, sig);	/* fails with EINVAL: SIGSTOP cannot be caught */
	signal(SIGCONT, sig);
	signal(SIGWINCH, sig);
	signal(SIGABRT, sig);
again:
	printf("PID: %d\n", getpid());
	fflush(NULL);
	errno = 0;
	sleep(30);
	e = errno;	/* EINTR if a handler interrupted the sleep */
	printf("after sleep: errno=%d %s\n", e, strerror(e));
	if (e)
		goto again;
	return 0;
}

# ./a.out
PID: 16382
<------ kill -STOP 16382
<------ kill -ABRT 16382
<------ kill -WINCH 16382
<------ kill -CONT 16382
sig: 28 Window changed
sig: 18 Continued
sig: 6 Aborted
after sleep: errno=4 Interrupted system call
PID: 16382
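
The manual session above can also be scripted. The following is only a
sketch under my own assumptions, not something from the thread: the names
replay_signals_after_stop(), report_sig() and read_full() are made up for
illustration. A forked child installs the same three handlers, the parent
stops it, sends SIGABRT and SIGWINCH while it is stopped, then SIGCONT,
and reads back the order in which the handlers ran. The order is recorded
but deliberately not assumed.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t handled;	/* handlers run so far (child only) */
static int report_fd = -1;		/* write end of the pipe (child only) */

/* Async-signal-safe handler: report the signal number as one byte. */
static void report_sig(int n)
{
	unsigned char b = (unsigned char)n;
	ssize_t r = write(report_fd, &b, 1);

	(void)r;
	handled++;
}

/* Read exactly len bytes; returns 0 on success, -1 on EOF or error. */
static int read_full(int fd, unsigned char *buf, size_t len)
{
	while (len > 0) {
		ssize_t r = read(fd, buf, len);

		if (r <= 0)
			return -1;
		buf += r;
		len -= (size_t)r;
	}
	return 0;
}

/*
 * Hypothetical helper: stop a child, send it SIGABRT and SIGWINCH while it
 * is stopped, then SIGCONT. Stores the order in which the handlers ran in
 * order[0..2] and returns the number of handlers observed, or -1 on error.
 */
int replay_signals_after_stop(int order[3])
{
	int fds[2], status, i;
	unsigned char buf[3];
	pid_t pid;

	if (pipe(fds) < 0)
		return -1;
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		struct timespec ts = { 0, 10 * 1000 * 1000 };	/* 10 ms */
		unsigned char ready = 0;

		close(fds[0]);
		report_fd = fds[1];
		signal(SIGCONT, report_sig);
		signal(SIGWINCH, report_sig);
		signal(SIGABRT, report_sig);
		if (write(report_fd, &ready, 1) != 1)	/* handlers are in place */
			_exit(1);
		while (handled < 3)	/* poll, so no check-then-pause race */
			nanosleep(&ts, NULL);
		_exit(0);
	}
	close(fds[1]);
	if (read_full(fds[0], buf, 1) < 0)	/* wait for the ready byte */
		goto fail;
	kill(pid, SIGSTOP);
	if (waitpid(pid, &status, WUNTRACED) < 0 || !WIFSTOPPED(status))
		goto fail;
	kill(pid, SIGABRT);	/* both stay pending while the child is stopped */
	kill(pid, SIGWINCH);
	kill(pid, SIGCONT);	/* resumes the child; handlers run now */
	if (read_full(fds[0], buf, 3) < 0)
		goto fail;
	for (i = 0; i < 3; i++)
		order[i] = buf[i];
	waitpid(pid, &status, 0);
	close(fds[0]);
	return 3;
fail:
	kill(pid, SIGKILL);
	waitpid(pid, NULL, 0);
	close(fds[0]);
	return -1;
}
```

A caller would pass an int[3] and print the entries to compare them with
the "sig:" lines in the transcript above.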


I believe it would be best if the debugger sees signals immediately,
but when it does ptrace(PTRACE_CONT/SYSCALL, ..., <sig>) to send
a signal to a group-stopped tracee, the signal is queued to the tracee
without terminating the group-stop. When SIGCONT arrives,
ptrace(PTRACE_CONT/SYSCALL, ..., SIGCONT) terminates the group-stop
and causes all queued signals to be handled. They are handled in no
guaranteed order, not necessarily in the order of arrival; even the
SIGCONT handler is not guaranteed to be called first, as you see above.
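
For reference, here is a minimal sketch of the re-injection step itself,
i.e. how a tracer passes a signal on with ptrace(PTRACE_CONT, ..., <sig>).
It covers only an ordinary signal-delivery-stop as it works today, not the
group-stop queuing proposed above. The names trace_and_forward() and
on_usr1() are hypothetical, and using SIGUSR1 is just my choice for the
example.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_usr1;	/* set by the tracee's handler */

static void on_usr1(int n)
{
	(void)n;
	got_usr1 = 1;
}

/*
 * Hypothetical helper: fork a tracee that raises SIGUSR1 at itself, and
 * forward every signal the tracer sees back to it via PTRACE_CONT.
 * Returns the tracee's exit status (0 if its handler ran) or -1 on error;
 * *forwarded counts the signal-delivery-stops that were re-injected.
 */
int trace_and_forward(int *forwarded)
{
	pid_t pid;

	*forwarded = 0;
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
			_exit(2);
		signal(SIGUSR1, on_usr1);
		raise(SIGUSR1);		/* tracer sees this before the handler */
		_exit(got_usr1 ? 0 : 1);
	}
	for (;;) {
		int status;

		if (waitpid(pid, &status, 0) < 0)
			return -1;
		if (WIFEXITED(status))
			return WEXITSTATUS(status);
		if (WIFSIGNALED(status))
			return -1;
		if (WIFSTOPPED(status)) {
			int sig = WSTOPSIG(status);

			/* A nonzero data argument delivers sig to the tracee. */
			if (ptrace(PTRACE_CONT, pid, NULL, (void *)(long)sig) < 0) {
				kill(pid, SIGKILL);
				waitpid(pid, NULL, 0);
				return -1;
			}
			(*forwarded)++;
		}
	}
}
```

Passing 0 instead of sig in the PTRACE_CONT call would suppress the signal,
and the tracee would then exit with status 1 because its handler never ran.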

--
vda