Re: ptrace() hangs on attempt to seize/attach stopped & frozen task

From: Oleg Nesterov
Date: Tue Nov 17 2015 - 13:38:30 EST


On 11/16, Tejun Heo wrote:
>
> *** WARNING: THE ATTACHED DOCUMENT(S) CONTAIN MACROS ***
> *** MACROS MAY CONTAIN MALICIOUS CODE ***
> *** Open only if you can verify and trust the sender ***
> *** Please contact infosec@xxxxxxxxxx if you have questions or concerns **

Hmm, infosec@xxxxxxxxxx doesn't like you. But I dared to open and nothing
happened so far. although perhaps you already own my machine.

> > And perhaps we can simply remove this logic? I forgot why do we hide this
> > STOPPED -> RUNNING -> TRACED transition from the attaching thread. But the
> > vague feeling tells me that we discussed this before and perhaps it was me
> > who suggested to avoid the user-visible change when you introduced this
> > transition...
>
> Heh, it was too long ago for me to remember much. :)

Same here...

> > Anyway, now I do not understand why do we want to hide it. Lets consider
> > the following "test-case",
> >
> > void test(int pid)
> > {
> > kill(pid, SIGSTOP);
> > waitpid(pid, NULL, WSTOPPED);
> >
> > ptrace(PTRACE_ATTACH-OR-PTRACE_SEIZE, pid, 0,0);
> >
> > assert(ptrace(PTRACE_DETACH, pid, 0,0) == 0);
> > }
> >
> > Yes, it will fail if we remove JOBCTL_TRAPPING. But it can equally fail
> > if SIGCONT comes before ATTACH, so perhaps we do not really care?
> >
> > Jan, Pedro, do you think the patch below can break gdb somehow? With this
> > patch you can never assume that waitpid(WNOHANG) or ptrace(WHATEVER) will
> > succeed right after PTRACE_ATTACH/PTRACE_SEIZE, even if you know that the
> > tracee was TASK_STOPPED before attach.
> >
> > Tejun, do you see any reason to keep JOBCTL_TRAPPING?
>
> Hmmm... It's nasty tho. We're breaking a guaranteed userland behavior

Perhaps you are right, but I am wondering if it was ever guaranteed.

What actually annoys me is that now I am almost sure that it was me
who asked you to hide this from user-space, and today I see no reason
for this hack.

> I'd be a lot more comfortable stating
> that cgroup freezer is currently broken rather than diddling with
> subtle ptrace semantics.

OK, lets keep this JOBCTL_TRAPPING_BIT.

But still I would like to know what Pedro thinks...

Anyway, wait_on_bit(TASK_UNINTERRUPTIBLE) doesn't look good. Do you
see any problem with the change below? Yes, the comment is not clear,
it should be updated, the tracee can clear this bit too.

And perhaps we can change get_task_state() until freezer gets another state,

--- x/fs/proc/array.c
+++ x/fs/proc/array.c
@@ -126,6 +126,9 @@ static inline const char *get_task_state
{
unsigned int state = (tsk->state | tsk->exit_state) & TASK_REPORT;

+ if (tsk->flags & PF_FROZEN)
+ return "D (frozen)";
+
BUILD_BUG_ON(1 + ilog2(TASK_REPORT) != ARRAY_SIZE(task_state_array)-1);

return task_state_array[fls(state)];

?

Oleg.

--- x/kernel/ptrace.c
+++ x/kernel/ptrace.c
@@ -364,8 +364,13 @@ unlock_creds:
mutex_unlock(&task->signal->cred_guard_mutex);
out:
if (!retval) {
- wait_on_bit(&task->jobctl, JOBCTL_TRAPPING_BIT,
- TASK_UNINTERRUPTIBLE);
+ if (wait_on_bit(&task->jobctl, JOBCTL_TRAPPING_BIT,
+ TASK_KILLABLE))
+ /*
+ * We will clear JOBCTL_TRAPPING in __ptrace_unlink(),
+ * until then nobody can trace this task anyway.
+ */
+ retval = -EINTR;
proc_ptrace_connector(task, PTRACE_ATTACH);
}


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/