Re: [RFC,PATCH] Use RCU to protect tasklist for unicast signals

From: Oleg Nesterov
Date: Tue Aug 16 2005 - 06:46:00 EST


Paul E. McKenney wrote:
>
> OK, the attached instead revalidates that the task struct still references
> the sighand_struct after obtaining the lock

Personally I think this is the way to go. One nitpick:
could you factor this out into a separate function (say,
lock_task_sighand) which does the whole lock-and-revalidate job?
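
To make the suggestion concrete, here is a minimal userspace sketch of what such a helper could look like. It is a model, not kernel code: the *_model names are hypothetical, a C11 atomic_flag stands in for ->siglock, interrupt disabling is left out, and it assumes something else (RCU, in the patch under discussion) keeps the sighand memory valid between reading the pointer and taking the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for sighand_struct and task_struct. */
struct sighand_model {
    atomic_flag siglock;                      /* models ->siglock */
};

struct task_model {
    _Atomic(struct sighand_model *) sighand;  /* NULL once the task has exited */
};

static void model_spin_lock(atomic_flag *lock)
{
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        ;  /* spin until the holder releases the lock */
}

static void model_spin_unlock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/*
 * Take the lock of the sighand the task currently points to.
 * Returns the locked sighand, or NULL if the task has no sighand.
 * The pointer is re-read after locking; if it changed in between,
 * the lock is dropped and the whole sequence is retried.
 */
static struct sighand_model *lock_task_sighand_model(struct task_model *tsk)
{
    for (;;) {
        struct sighand_model *sp = atomic_load(&tsk->sighand);

        if (sp == NULL)
            return NULL;
        model_spin_lock(&sp->siglock);
        if (atomic_load(&tsk->sighand) == sp)
            return sp;                        /* still ours: keep the lock */
        model_spin_unlock(&sp->siglock);      /* changed under us: retry */
    }
}

static void unlock_task_sighand_model(struct sighand_model *sp)
{
    assert(sp != NULL);
    model_spin_unlock(&sp->siglock);
}
```

A caller would then do its work only when the helper returns non-NULL, and would not need to open-code the retry label at every call site.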

> > and there are some remaining problems
> > that I need to sort out, including:
> ...
>
> o Some of the functions invoked by __group_send_sig_info(),
> including handle_stop_signal(), momentarily drop ->siglock.

Just to be sure that one point doesn't escape your attention, this:

> +++ linux-2.6.13-rc4-realtime-preempt-V0.7.53-01-tasklistRCU/kernel/signal.c 2005-08-14 19:53:28.000000000 -0700
> @@ -328,9 +328,11 @@ void __exit_sighand(struct task_struct *
> struct sighand_struct * sighand = tsk->sighand;
>
> /* Ok, we're done with the signal handlers */
> + spin_lock(&sighand->siglock);
> tsk->sighand = NULL;
> if (atomic_dec_and_test(&sighand->count))
> - kmem_cache_free(sighand_cachep, sighand);
> + sighand_free(sighand);
> + spin_unlock(&sighand->siglock);

is not enough (and is unneeded anyway). Unless I have missed
something, we still have a race:

release_task:

	__exit_signal:
		spin_lock(sighand);
		spin_unlock(sighand);
		flush_sigqueue(&sig->shared_pending);
		kmem_cache_free(tsk->signal);
				// here comes group_send_sig_info(), locks ->sighand,
				// delivers the signal to the ->shared_pending.
				// siginfo leaked, or crash.
	__exit_sighand:
		spin_lock(sighand);
		tsk->sighand = NULL;
		// too late !!!!

I think that release_task() should not use __exit_sighand()
at all. Instead, __exit_signal() should set tsk->sighand = NULL
under ->sighand->siglock.
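
A rough userspace sketch of that ordering, to show why it closes the window. Everything here is a model under stated assumptions: the *_model names are hypothetical, a C11 atomic_flag stands in for ->siglock, shared_pending is reduced to a counter, and the sender simply reports -ESRCH when ->sighand changed instead of retrying as the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; only the fields this sketch needs. */
struct sighand_model { atomic_flag siglock; };
struct signal_model  { int shared_pending; };   /* number of queued signals */

struct task_model {
    _Atomic(struct sighand_model *) sighand;
    struct signal_model *signal;
};

static void model_spin_lock(atomic_flag *lock)
{
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        ;  /* spin */
}

static void model_spin_unlock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Exit side: clear ->sighand while holding the lock, flush afterwards. */
static void exit_signal_model(struct task_model *tsk)
{
    struct sighand_model *sp = atomic_load(&tsk->sighand);

    assert(sp != NULL);
    model_spin_lock(&sp->siglock);
    atomic_store(&tsk->sighand, NULL);   /* later senders fail the re-check */
    model_spin_unlock(&sp->siglock);

    /* No sender can queue anything from here on, so the flush is final. */
    tsk->signal->shared_pending = 0;     /* stand-in for flush_sigqueue() */
    tsk->signal = NULL;                  /* stand-in for freeing ->signal */
}

/* Sender side: queue a signal only if ->sighand is unchanged under the lock. */
static int send_model(struct task_model *tsk)
{
    struct sighand_model *sp = atomic_load(&tsk->sighand);

    if (sp == NULL)
        return -ESRCH;
    model_spin_lock(&sp->siglock);
    if (atomic_load(&tsk->sighand) != sp) {
        model_spin_unlock(&sp->siglock);
        return -ESRCH;                   /* task exited between read and lock */
    }
    tsk->signal->shared_pending++;       /* safe: exit side has not run yet */
    model_spin_unlock(&sp->siglock);
    return 0;
}
```

Either the sender takes the lock first, and its signal is queued before the flush, or the exit side takes it first, and the sender sees ->sighand == NULL and never touches ->signal.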

> int group_send_sig_info(int sig, struct siginfo *info, struct task_struct *p)
> {
> unsigned long flags;
> + struct sighand_struct *sp;
> int ret;
>
> +retry:
> ret = check_kill_permission(sig, info, p);
> - if (!ret && sig && p->sighand) {
> + if (!ret && sig && (sp = p->sighand)) {
> if (!get_task_struct_rcu(p)) {
> return -ESRCH;
> }
> - spin_lock_irqsave(&p->sighand->siglock, flags);
> + spin_lock_irqsave(&sp->siglock, flags);
> + if (p->sighand != sp) {
> + spin_unlock_irqrestore(&sp->siglock, flags);
> + put_task_struct(p);
> + goto retry;
> + }
> ret = __group_send_sig_info(sig, info, p);
> - spin_unlock_irqrestore(&p->sighand->siglock, flags);
> + spin_unlock_irqrestore(&sp->siglock, flags);
> put_task_struct(p);

Do we really need get_task_struct_rcu/put_task_struct here?

The task_struct can't go away under us because it is RCU-protected.
If ->sighand is locked and is still the same after the re-check,
then 'p' has not done __exit_signal() yet, so it is safe to send
the signal.

And if the task has ->usage == 0, it means that it also has
->sighand == NULL, and your code will notice that.

No?

Oleg.