Re: [PATCH/RFC] Futex mmap_sem deadlock

From: Jamie Lokier
Date: Tue Feb 22 2005 - 16:10:04 EST


Andrew Morton wrote:
> > This will quickly lock up, since the futex_wait code does a
> > down_read(mmap_sem), then a get_user().
> >
> > The do_page_fault code on ppc64 (as well as other architectures) needs
> > to take the same semaphore for reading. This is all good until the
> > second thread comes into play: its mmap call tries to take the same
> > semaphore for writing, which causes the down_read() in do_page_fault
> > to get stuck. Classic deadlock.
>
> Yup. Jamie says that the futex code _has_ to hold mmap_sem across the
> get_user(). I forget (but could probably locate) the details.

It does - the "key" which identifies a futex depends on a vma
calculation, and the vma must not change between the calculation and
the get_user().

> > One attempt to fix this is included below. It works, but I'm not entirely
> > happy with the fact that it's a somewhat messy solution. If anyone has a
> > better idea for how to solve it I'd be all ears.
>
> It's fairly sane. Style-wise I'd be inclined to turn this:
>
>     down_read(&current->mm->mmap_sem);
>     while (!check_user_page_readable(current->mm, uaddr1)) {
>         up_read(&current->mm->mmap_sem);
>         /* Fault in the page through get_user() but discard result */
>         if (get_user(curval, (int __user *)uaddr1) != 0)
>             return -EFAULT;
>         down_read(&current->mm->mmap_sem);
>     }

That won't work because the vma lock must be held between key
calculation and get_user() - otherwise futex is not reliable. It
would work if the futex key calculation were inside the loop.
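
To illustrate what "key calculation inside the loop" means, here is a
small user-space model, not kernel code. Every name in it (retry_mm,
retry_key, retry_get_key, retry_fault_in_unlocked, retry_wait_setup)
is made up for this sketch, and the "generation" counter is only an
assumed stand-in for "the vma may have changed while the lock was
dropped":

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct retry_mm {
    int lock_readers;      /* stands in for read holds on mmap_sem */
    unsigned int map_gen;  /* bumped whenever the mapping changes */
    bool page_present;     /* can the futex word be read without a fault? */
};

struct retry_key {
    unsigned long addr;
    unsigned int gen;      /* mapping generation the key was computed against */
};

static struct retry_key retry_get_key(const struct retry_mm *mm,
                                      unsigned long uaddr)
{
    struct retry_key key = { uaddr, mm->map_gen };
    return key;
}

/* Fault the page in with the lock dropped. Another thread may remap in
 * this window; the model shows that by bumping the generation. */
static void retry_fault_in_unlocked(struct retry_mm *mm)
{
    mm->map_gen++;
    mm->page_present = true;
}

/* Returns with the lock still held and a key that matches the mapping. */
static struct retry_key retry_wait_setup(struct retry_mm *mm,
                                         unsigned long uaddr)
{
    struct retry_key key;

    mm->lock_readers++;                     /* models down_read() */
    for (;;) {
        key = retry_get_key(mm, uaddr);     /* recomputed on every pass */
        if (mm->page_present)
            break;                          /* key and page agree, lock held */
        assert(mm->lock_readers > 0);
        mm->lock_readers--;                 /* models up_read() */
        retry_fault_in_unlocked(mm);
        mm->lock_readers++;                 /* models down_read() again */
    }
    return key;
}
```

The point of the model is only that a key computed before the lock was
dropped would carry a stale generation, while one recomputed on each
pass cannot.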

A much simpler solution (and sorry for not offering it earlier,
because Andrew Morton pointed out this bug long ago, but I was busy), is:

In futex.c:

    down_read(&current->mm->mmap_sem);
    get_futex_key(...) etc.
    queue_me(...) etc.
    current->flags |= PF_MMAP_SEM;    <- new
    ret = get_user(...);
    current->flags &= ~PF_MMAP_SEM;   <- new
    /* the rest */

And in arch/*/mm/fault.c, replace every one of these:

    down_read(&mm->mmap_sem);

    up_read(&mm->mmap_sem);

with these:

    if (!(current->flags & PF_MMAP_SEM))
        down_read(&mm->mmap_sem);

    if (!(current->flags & PF_MMAP_SEM))
        up_read(&mm->mmap_sem);
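
For anyone who wants to poke at the flag logic outside the kernel, a
rough user-space model follows. It is a sketch only: the PF_MMAP_SEM
value, the model_* names and the reader counter are all assumptions
made for illustration, and the "fault" is just a direct function call
standing in for get_user() faulting.

```c
#include <assert.h>

#define PF_MMAP_SEM 0x1u    /* hypothetical value, chosen for the model */

/* A counter stands in for read holds on mmap_sem. */
struct model_sem {
    int readers;
};

struct model_task {
    unsigned int flags;
    struct model_sem mmap_sem;
    int fault_lock_acquisitions;    /* times the fault path took the lock */
};

static void model_down_read(struct model_sem *sem)
{
    sem->readers++;
}

static void model_up_read(struct model_sem *sem)
{
    assert(sem->readers > 0);
    sem->readers--;
}

/* Models the fault handler: skip the lock if the task says it holds it. */
static void model_page_fault(struct model_task *t)
{
    if (!(t->flags & PF_MMAP_SEM)) {
        model_down_read(&t->mmap_sem);
        t->fault_lock_acquisitions++;
    }

    /* ... the fault would be handled here ... */

    if (!(t->flags & PF_MMAP_SEM))
        model_up_read(&t->mmap_sem);
}

/* Models the futex wait path: the lock stays held across the access. */
static void model_futex_wait(struct model_task *t)
{
    model_down_read(&t->mmap_sem);
    t->flags |= PF_MMAP_SEM;
    model_page_fault(t);            /* stands in for get_user() faulting */
    t->flags &= ~PF_MMAP_SEM;       /* clear only this bit */
    model_up_read(&t->mmap_sem);
}
```

In the model, a fault taken from model_futex_wait() never touches the
lock a second time, and the flag is clear again afterwards, so an
ordinary fault takes and releases the lock as before.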

-- Jamie