Re: [RFC,PATCH] Use RCU to protect tasklist for unicast signals

From: Paul E. McKenney
Date: Wed Aug 17 2005 - 16:19:41 EST


On Wed, Aug 17, 2005 at 06:35:03PM +0400, Oleg Nesterov wrote:
> Paul E. McKenney wrote:
> >
> > > The other thing that jumped out at me is that signals are very different
> > > animals from a locking viewpoint depending on whether they are:
> > >
> > > 1. ignored,
> > >
> > > 2. caught by a single thread,
> > >
> > > 3. fatal to multiple threads/processes (though I don't know
> > > of anything that shares sighand_struct between separate
> > > processes), or
> > >
> > > 4. otherwise global to multiple threads/processes (such as
> > > SIGSTOP and SIGCONT).
> > >
> > > And there are probably other distinctions that I have not yet caught
> > > on to.
> > >
> > > One way to approach this would be to make your suggested lock_task_sighand()
> > > look at the signal and acquire the appropriate locks. If, having acquired
> > > a given set of locks, it found that the needed set had changed (e.g., due
> > > to racing exec() or sigaction()), then it drops the locks and retries.
> >
> > OK, for this sort of approach to work, lock_task_sighand() must take and
> > return some sort of mask indicating what locks are held. The mask returned
> > by lock_task_sighand() would then be passed to an unlock_task_sighand().
>
> Sorry, I don't understand you. CLONE_THREAD implies CLONE_SIGHAND,
> so we always need to lock one ->sighand. Could you please clarify?

On the #3 and #4 code paths, the code assumes that the task-list lock
is held. So I was thinking of something (very roughly) as follows:

#define SIGLOCK_HOLD_RCU (1 << 0)
#define SIGLOCK_HOLD_TASKLIST (1 << 1)
#define SIGLOCK_HOLD_SIGLOCK (1 << 2)

int lock_task_sighand(int sig, struct task_struct *p, int locksheld, struct sighand_struct **spp, int *flags)
{
int locksret = 0;
struct sighand_struct *sp;

retry:
if (!(locksheld & SIGLOCK_HOLD_RCU)) {
locksret |= SIGLOCK_HOLD_RCU;
rcu_read_lock();
}
sp = rcu_dereference(p->sighand);
if (sp == NULL) {
unlock_task_sighand(NULL, locksret, *flags);
*spp = NULL;
return 0;
}
if (!(locksheld & SIGLOCK_HOLD_TASKLIST)) {
/* Complain if siglock held. */
if (sig_kernel_stop(sig) /* expand for other conditions */) {
locksret |= SIGLOCK_HOLD_TASKLIST;
read_lock(&tasklist_lock);
}
}
if (!(locksheld & SIGLOCK_HOLD_SIGLOCK)) {
*flags = 0;
} else {
locksret |= SIGLOCK_HOLD_SIGLOCK;
spin_lock_irqsave(&sp->siglock, *flags);
if (p->sighand != sp) {
unlock_task_sighand(sp, locksret, *flags);
goto retry;
}
}
*spp = sp;
return locksret;
}

void unlock_task_sighand(struct sighand_struct *sp, int lockstodrop, int flags)
{
/* lockstodrop had better be non-negative! */

if (lockstodrop & SIGLOCK_HOLD_SIGLOCK) {
spin_unlock_irqrestore(&sp->siglock, flags);
}
if (lockstodrop & SIGLOCK_HOLD_TASKLIST) {
read_unlock(&tasklist_lock);
}
if (lockstodrop & SIGLOCK_HOLD_RCU) {
rcu_read_unlock();
}
}

The "expand for other conditions" must also cover signals that cause
coredumps, that kill all threads in a process, or that otherwise affect
more than one thread (e.g., kill with a negative signal number). I am
assuming that your argument about not needing the get_task_struct_rcu()
eventually work their way through my skull. ;-)

Of course, the "expand for other conditions" requires reference to the
sighand_struct. And for that, you really need to be holding siglock.
But you have to drop siglock to acquire tasklist_lock. So will end up
relying on RCU some more to safely peek into sighand_struct -before-
acquiring siglock -- which is why I snapshot p->sighand so early, it
will need to be referenced when checking "other conditions".

I am not exactly happy with the above pair of functions, but I suspect
that they might -really- simplify things a bit.

The usage would be as follows:

if ((lockstodrop = lock_task_sighand(sig, tsk, 0, &sp, &flags)) < 0)
return lockstodrop; /* error code */
if (sp != NULL) {
/* invoke the function to send the signal */
}
unlock_task_sighand(sp, lockstodrop, flags);

Thoughts? Hey, you asked for this!!! ;-)

> > > > + if (!ret && sig && (sp = p->sighand)) {
> > > > if (!get_task_struct_rcu(p)) {
> > > > return -ESRCH;
> > > > }
> > > > - spin_lock_irqsave(&p->sighand->siglock, flags);
> > > > + spin_lock_irqsave(&sp->siglock, flags);
> > > > + if (p->sighand != sp) {
> > > > + spin_unlock_irqrestore(&sp->siglock, flags);
> > > > + put_task_struct(p);
> > > > + goto retry;
> > > > + }
> > > > ret = __group_send_sig_info(sig, info, p);
> > > > - spin_unlock_irqrestore(&p->sighand->siglock, flags);
> > > > + spin_unlock_irqrestore(&sp->siglock, flags);
> > > > put_task_struct(p);
> > >
> > > Do we really need get_task_struct_rcu/put_task_struct here?
> > >
> > > The task_struct can't go away under us, it is rcu protected.
> > > When ->sighand is locked, and it is still the same after
> > > the re-check, it means that 'p' has not done __exit_signal()
> > > yet, so it is safe to send the signal.
> > >
> > > And if the task has ->usage == 0, it means that it also has
> > > ->sighand == NULL, and your code will notice that.
> > >
> > > No?
> >
> > Seems plausible. I got paranoid after seeing the lock dropped in
> > handle_stop_signal(), though.
>
> Yes, this is bad and should be fixed, I agree.
>
> But why do you think we need to bump task->usage? It can't make any
> difference, afaics. The task_struct can't dissapear, the caller was
> converted to use rcu_read_lock() or it holds tasklist_lock.
>
> Nonzero task_struct->usage can't stop do_exit or sys_wait4, it will
> only postpone call_rcu(__put_task_struct_cb).
>
> And after we locked ->sighand we have sufficient memory barrier, so
> if we read the stale value into 'sp' we will notice that (if you were
> worried about this issue).
>
> Am I missed something?

I doubt if you are missing anything. I was just being paranoid.
Seeing locks get momentarily get dropped in the middle of functions is
a -really- good way to get me into that state! But since that code
path can be called with tasklist_lock held, it should not be sleeping,
so I believe that you are correct.

Hence my switching to rcu_read_lock() in lock_task_sighand() above.

> > void exit_sighand(struct task_struct *tsk)
> > {
> > write_lock_irq(&tasklist_lock);
> > - __exit_sighand(tsk);
> > + spin_lock(&tsk->sighand->siglock);
> > + if (tsk->sighand != NULL) {
> > + __exit_sighand(tsk);
> > + }
> > + spin_unlock(&tsk->sighand->siglock);
> > write_unlock_irq(&tasklist_lock);
> > }
>
> Very strange code. Why this check? And what happens with
>
> spin_unlock(&tsk->sighand->siglock);
>
> when tsk->sighand == NULL ?

My bad. Ingo beat you to it though. ;-)

Thanx, Paul
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/