Re: while_each_thread() under rcu_read_lock() is broken?

From: Paul E. McKenney
Date: Mon Jun 21 2010 - 17:38:29 EST


On Mon, Jun 21, 2010 at 02:22:59PM -0700, Eric W. Biederman wrote:
> "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> writes:
>
> > On Mon, Jun 21, 2010 at 07:09:19PM +0200, Oleg Nesterov wrote:
> >> On 06/18, Paul E. McKenney wrote:
> >> >
> >> > On Fri, Jun 18, 2010 at 09:34:03PM +0200, Oleg Nesterov wrote:
> >> > >
> >> > > #define XXX(t) ({ \
> >> > > struct task_struct *__prev = t; \
> >> > > t = next_thread(t); \
> >> > > t != g && t != __prev; \
> >> > > })
> >> > >
> >> > > #define while_each_thread(g, t) \
> >> > > while (XXX(t))
> >> >
> >> > Isn't the above vulnerable to a pthread_create() immediately following
> >> > the offending exec()? Especially if the task doing the traversal is
> >> > preempted?
> >>
> >> Yes, thanks!
> >>
> >> > here are some techniques that might (or might not) help:
> >>
> >> To simplify, let's consider the concrete example,
> >
> > Sounds very good!
> >
> >> rcu_read_lock();
> >>
> >> g = t = returns_the_rcu_safe_task_struct_ptr();
> >
> > This returns a pointer to the task struct of the current thread?
> > Or might this return a pointer to some other thread's task struct?
> >
> >> do {
> >> printk("%d\n", t->pid);
> >> } while_each_thread(g, t);
> >>
> >> rcu_read_unlock();
> >>
> >> Whatever we do, without tasklist/siglock this can obviously race
> >> with fork/exit/exec. It is OK to miss a thread, or print the pid
> >> of the already exited/released task.
> >>
> >> But it should not loop forever (the subject), and it should not
> >> print the same pid twice (ignoring pid reuse, of course).
> >>
> >> And, afaics, there are no problems with rcu magic per se, next_thread()
> >> always returns the task_struct we can safely dereference. The only
> >> problem is that while_each_thread() assumes that sooner or later
> >> next_thread() must reach the starting point, g.
> >>
> >> (zap_threads() is different, it must not miss a thread with ->mm
> >> we are going to dump, but it holds mmap_sem).
> >
> > Indeed, the tough part is figuring out when you are done given that things
> > can come and go at will. Some additional tricks, in no particular order:
> >
> > 1. Always start at the group leader. Of course, the group leader
> > is probably permitted to leave any time it wants to, so this
> > is not sufficient in and of itself.
>
> No. The group leader must exist as long as the group exists.
> Modulo de_thread weirdness. The group_leader can be a zombie but
> it can not go away completely.

Ah, OK -- now that you mention it, all the things that I can think of
that remove a thread from a group have the side effect of destroying
the old group (exec() and exit()). Other things that create a new thread
group leave the old thread group intact.

Or am I forgetting some odd operation?

> > 2. Maintain a separate task structure that flags the head of the
> > list. This separate structure is freed one RCU grace period
> > following the disappearance of the current group leader. This
> > should be quite robust, but "holy overhead, Batman!!!" (Apologies
> > for the American pop culture reference, but nothing else seemed
> > appropriate.)
>
> That is roughly what we have in the group leader right now.

But can't the group leader do an exec(), becoming the leader of a new
thread group without waiting for a grace period? Or is this possibility
already covered?

Thanx, Paul