Re: [PATCH] fix kill_proc_info() vs copy_process() race

From: Oleg Nesterov
Date: Wed Feb 15 2006 - 07:46:04 EST


Paul E. McKenney wrote:
>
> On Mon, Feb 06, 2006 at 07:45:48PM +0300, Oleg Nesterov wrote:
> > The first race is simple, copy_process:
> >
> > /*
> > * Check for pending SIGKILL! The new thread should not be allowed
> > * to slip out of an OOM kill. (or normal SIGKILL.)
> > */
> > if (sigismember(&current->pending.signal, SIGKILL))
> > return EINTR;
> >
> > This relies on tasklist_lock and is racy now.
>
> Agreed, but is the race any worse than it was before? Since SIGKILL is
> fatal, the bit can be set but never cleared. My belief, quite possibly
> mistaken, is that this is a performance issue rather than a correctness
> issue -- we would like to avoid the overhead of a fork() for a "walking
> dead" process.

My apologies, I was very unclear. I talked about this check because I tried
to unify it with 'if (SIGNAL_GROUP_EXIT)' below. Let me try again.

copy_process(CLONE_THREAD) __group_complete_signal(SIGKILL)

lock(->sighand);
if (->signal->flags & SIGNAL_GROUP_EXIT) // NO
...abort forking...
unlock(->sighand)

->signal->flags = SIGNAL_GROUP_EXIT;
// does not see the new thread yet
for_each_thread_in_thread_group(t) {
sigaddset(t->pending, SIGKILL);
signal_wake_up(t);
}

... finish clone ...


The new thread starts without TIF_SIGPENDING. When any of other threads calls
get_signal_to_deliver() it will notice SIGKILL and call do_group_exit(), which
does:

if (SIGNAL_GROUP_EXIT) // Yes, was set in group_complete_signal()
// don't call zap_other_threads()
do_exit();

So, thread group missed SIGKILL. The new thread runs with SIGNAL_GROUP_EXIT set
and has SIGKILL in ->shared_pending, so it can't be killed via sys_kill(SIGKILL),
and it can't be stopped.

This is not fatal, we can kill this thread via tkill(), even if it blocked other
signals, but still this is a bug (if I am right).

> > The second race is more tricky, copy_process:
> >
> > attach_pid(p, PIDTYPE_PID, p->pid);
> > attach_pid(p, PIDTYPE_TGID, p->tgid);
> >
> > This means that we can find a task in kill_proc_info()->find_task_by_pid()
> > which is not registered as part of thread group yet. Various bad things can
> > happen, note that handle_stop_signal(SIGCONT) and __group_complete_signal()
> > iterate over threads list. But p->pids[PIDTYPE_TGID] is a copy of current's
> > 'struct pid' from dup_task_struct(), and if we don't have CLONE_THREAD here
> > we will use completely unreleated (parent's) thread list.
>
> But I could easily be missing something, still a bit jetlagged. Could
> you please lay out the exact sequence of events in the scenario that you
> are thinking of?

Let's suppose that process with pid == 1000 does fork (no CLONE_THREAD bit),
and a bad boy does sys_kill(1001, SIGXXX)

copy_process:

// it is possible that p->pid == 1001
attach_pid(p, PIDTYPE_PID, p->pid);


kill_proc_info:

p = find_task_by_pid(1001); // Found!

__group_complete_signal(p):

// iterate over thread group
do {
...
} while (next_thread(t) != p)

The (one of) problem is that this loop never stops: next_thread() will iterate
over parent's threads list, because p have a copy of the parent's pids[PIDTYPE_TGID],
and p is not a member of this thread group. Unless I missed something, we have
an endless loop with interrupts disabled.

> And if there is a real problem, is it possible to fix it by changing
> the order of the attach_pid() calls?

I think yes, and I did exactly that in my next attempt to fix this problem.

Oleg.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/