Re: [PATCH v4 12/12] sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state

From: Alexander Gordeev
Date: Tue Jun 21 2022 - 09:02:02 EST


On Thu, May 05, 2022 at 01:26:45PM -0500, Eric W. Biederman wrote:
> From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>
> Currently ptrace_stop() / do_signal_stop() rely on the special states
> TASK_TRACED and TASK_STOPPED resp. to keep unique state. That is, this
> state exists only in task->__state and nowhere else.
>
> There's two spots of bother with this:
>
> - PREEMPT_RT has task->saved_state which complicates matters,
> meaning task_is_{traced,stopped}() needs to check an additional
> variable.
>
> - An alternative freezer implementation that itself relies on a
> special TASK state would loose TASK_TRACED/TASK_STOPPED and will
> result in misbehaviour.
>
> As such, add additional state to task->jobctl to track this state
> outside of task->__state.
>
> NOTE: this doesn't actually fix anything yet, just adds extra state.
>
> --EWB
> * didn't add a unnecessary newline in signal.h
> * Update t->jobctl in signal_wake_up and ptrace_signal_wake_up
> instead of in signal_wake_up_state. This prevents the clearing
> of TASK_STOPPED and TASK_TRACED from getting lost.
> * Added warnings if JOBCTL_STOPPED or JOBCTL_TRACED are not cleared

Hi Eric, Peter,

On s390 this patch triggers warning at kernel/ptrace.c:272 when
kill_child testcase from strace tool is repeatedly used (the source
is attached for reference):

while :; do
strace -f -qq -e signal=none -e trace=sched_yield,/kill ./kill_child
done

It normally takes few minutes to cause the warning in -rc3, but FWIW
it hits almost immediately for ptrace_stop-cleanup-for-v5.19 tag of
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.

Commit 7b0fe1367ef2 ("ptrace: Document that wait_task_inactive can't
fail") suggests this WARN_ON_ONCE() is not really expected, yet we
observe a child in __TASK_TRACED state. Could you please comment here?

Thanks!
/*
* Check for the corner case that previously lead to segfault
* due to an attempt to access unitialised tcp->s_ent.
*
* 13994 ????( <unfinished ...>
* ...
* 13994 <... ???? resumed>) = ?
*
* Copyright (c) 2019 The strace developers.
* All rights reserved.
*
* SPDX-License-Identifier: GPL-2.0-or-later
*/

#include "tests.h"

#include <sched.h>
#include <signal.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>

#define ITERS 10000
#define SC_ITERS 10000

int
main(void)
{
volatile sig_atomic_t *const mem =
mmap(NULL, get_page_size(), PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
if (mem == MAP_FAILED)
perror_msg_and_fail("mmap");

for (unsigned int i = 0; i < ITERS; ++i) {
mem[0] = mem[1] = 0;

const pid_t pid = fork();
if (pid < 0)
perror_msg_and_fail("fork");

if (!pid) {
/* wait for the parent */
while (!mem[0])
;
/* let the parent know we are running */
mem[1] = 1;

for (unsigned int j = 0; j < SC_ITERS; j++)
sched_yield();

pause();
return 0;
}

/* let the child know we are running */
mem[0] = 1;
/* wait for the child */
while (!mem[1])
;

if (kill(pid, SIGKILL))
perror_msg_and_fail("kill");
if (wait(NULL) != pid)
perror_msg_and_fail("wait");
}

return 0;
}