Re: [PATCH] fix exit_itimers() vs posix_timer_event() AB-BA deadlock

From: Oleg Nesterov
Date: Sun Sep 25 2005 - 08:55:56 EST


Andrew Morton wrote:
>
> Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> >
> > +	/*
> > +	 * We are locking ->it_lock + tasklist_lock backwards
> > +	 * from release_task()->exit_itimers(), beware deadlock.
> > +	 */
> > +	leader = timr->it_process->group_leader;
> > +	while (unlikely(!read_trylock(&tasklist_lock))) {
> > +		if (leader->flags & PF_EXITING) {
> > +			smp_rmb();
> > +			if (thread_group_empty(leader))
> > +				return 0;
> > +		}
> > +		cpu_relax();
> > +	}
>
> Oh dear. Is there no way to fix this up by taking the locks in the correct
> order? (Whatever that is).

Yes, this is ugly and that is why I hesitated so long before posting this patch.

But I don't see a simple and correct fix. The lock ordering is not the only
problem. In fact, we don't need to take ->it_lock in itimer_delete() at
all. The problem is that itimer_delete() can't delete k_itimer.it.real.timer
while holding tasklist_lock, because (without this patch) that lock prevents
posix_timer_fn() from completing. It is unsafe to unlock tasklist_lock in
__exit_signal() before calling exit_itimers(), and it is too late to call
exit_itimers() after unlock_irq(&tasklist_lock) in release_task(), because by
then ->signal/->sighand are already gone.
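
For readers who want to play with the idea outside the kernel, here is a rough
userspace sketch of the "trylock, and bail out if the group is dying" loop from
the quoted hunk. It is only an analogy under stated assumptions: a C11
atomic_flag stands in for tasklist_lock (so it is a plain try-lock, not a
reader/writer lock), two atomic_bool fields stand in for PF_EXITING and
thread_group_empty(), and sched_yield() replaces cpu_relax(). The names
struct group, lock_outer_backwards(), unlock_outer() and demo() are
hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel objects in the quoted hunk. */
struct group {
	atomic_flag outer;	/* plays tasklist_lock (as a plain try-lock) */
	atomic_bool exiting;	/* plays PF_EXITING on the group leader */
	atomic_bool empty;	/* plays thread_group_empty(leader) */
};

/*
 * The caller already holds the inner lock (->it_lock in the patch), so it
 * must not block on the outer one. Returns true with the outer lock held,
 * or false if the group is being torn down and the caller should give up.
 */
bool lock_outer_backwards(struct group *g)
{
	while (atomic_flag_test_and_set_explicit(&g->outer,
						 memory_order_acquire)) {
		/* Acquire loads keep the two reads ordered, like smp_rmb(). */
		if (atomic_load_explicit(&g->exiting, memory_order_acquire) &&
		    atomic_load_explicit(&g->empty, memory_order_acquire))
			return false;
		sched_yield();	/* userspace substitute for cpu_relax() */
	}
	return true;
}

void unlock_outer(struct group *g)
{
	atomic_flag_clear_explicit(&g->outer, memory_order_release);
}

/* Single-threaded walk through both outcomes; returns 0 on success. */
int demo(void)
{
	struct group g = { ATOMIC_FLAG_INIT, false, false };
	bool got;

	got = lock_outer_backwards(&g);	/* uncontended: lock is taken */
	assert(got);

	/* Pretend the exit path holds the lock and has emptied the group. */
	atomic_store(&g.exiting, true);
	atomic_store(&g.empty, true);
	got = lock_outer_backwards(&g);	/* contended + dead group: bail */
	assert(!got);

	unlock_outer(&g);
	got = lock_outer_backwards(&g);	/* lock is free again: taken */
	assert(got);
	unlock_outer(&g);
	return 0;
}
```

The point of the bail-out check is that the side spinning on the outer lock
never waits for a holder that is itself waiting for the spinner to finish,
which is what breaks the AB-BA cycle. Whether the exit condition is really
sufficient in the kernel depends on details this sketch does not model.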

I believe the only sane fix would be to eliminate tasklist_lock from the signal
sending path (see the patches from Paul and Thomas), but this is hard and
(I think) needs more work and testing.

Oleg.