Re: [PATCH v2 2/5] sched,ptrace: Fix ptrace_check_attach() vs PREEMPT_RT

From: Eric W. Biederman
Date: Tue Apr 26 2022 - 20:24:21 EST


Oleg Nesterov <oleg@xxxxxxxxxx> writes:

> On 04/21, Peter Zijlstra wrote:
>>
>> @@ -2225,7 +2238,7 @@ static int ptrace_stop(int exit_code, in
>> * schedule() will not sleep if there is a pending signal that
>> * can awaken the task.
>> */
>> - current->jobctl |= JOBCTL_TRACED;
>> + current->jobctl |= JOBCTL_TRACED | JOBCTL_TRACED_QUIESCE;
>> set_special_state(TASK_TRACED);
>
> OK, this looks wrong. I actually mean the previous patch which sets
> JOBCTL_TRACED.
>
> The problem is that the tracee can be already killed, so that
> fatal_signal_pending(current) is true. In this case we can't rely on
> signal_wake_up_state() which should clear JOBCTL_TRACED, or the
> callers of ptrace_signal_wake_up/etc which clear this flag by hand.
>
> In this case schedule() won't block and ptrace_stop() will leak
> JOBCTL_TRACED. Unless I missed something.
>
> We could check fatal_signal_pending() and damn! this is what I think
> ptrace_stop() should have done from the very beginning. But for now
> I'd suggest to simply clear this flag before return, along with
> DELAY_WAKEKILL and LISTENING.

Oh. That is an interesting case for JOBCTL_TRACED. The
scheduler refuses to stop if signal_pending_state(TASK_TRACED, p)
returns true.

The ptrace_stop code used to handle this explicitly and in commit
7d613f9f72ec ("signal: Remove the bogus sigkill_pending in ptrace_stop")
I actually removed the test. As the test was somewhat wrong and
redundant, and in slightly the wrong location.

But doing:

/* Don't stop if the task is dying */
if (unlikely(__fatal_signal_pending(current)))
return exit_code;

Should work.

>
>> current->jobctl &= ~JOBCTL_LISTENING;
>> + current->jobctl &= ~JOBCTL_DELAY_WAKEKILL;
>
> current->jobctl &=
> ~(~JOBCTL_TRACED | JOBCTL_DELAY_WAKEKILL | JOBCTL_LISTENING);


I presume you meant:

current->jobctl &=
~(JOBCTL_TRACED | JOBCTL_DELAY_WAKEKILL | JOBCTL_LISTENING);

I don't think we want to do that. For the case you are worried about it
is a valid fix.

In general this is the wrong approach as we want the waker to clear
JOBCTL_TRACED. If the waker does not it is possible that
ptrace_freeze_traced might attempt to freeze a process whose state
is not appropriate for attach, because the code is past the call
to schedule().

In fact I think clearing JOBCTL_TRACED at the end of ptrace_stop
will allow ptrace_freeze_traced to come in while siglock is dropped,
expect the process to stop, and have the process not stop. Of
course wait_task_inactive coming first that might not be a problem.



This is a minor problem with the patchset I just posted. I thought the
only reason wait_task_inactive could fail was if ptrace_stop() hit the
!current->ptrace case. Thinking about any it any SIGKILL coming in
before tracee stops in schedule will trigger this, so it is not as
safe as I thought to not pass a state into wait_task_inactive.

It is time for me to shut down today. I will sleep on that and
see what I can see tomorrow.

Eric