Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier

From: Mike Galbraith
Date: Sat Oct 18 2014 - 03:09:26 EST


(fixes Davidlohr bounce)

On Sat, 2014-10-18 at 08:54 +0200, Mike Galbraith wrote:
> On Fri, 2014-10-17 at 17:38 +0100, Catalin Marinas wrote:
> > Commit b0c29f79ecea (futexes: Avoid taking the hb->lock if there's
> > nothing to wake up) changes the futex code to avoid taking a lock when
> > there are no waiters. This code has been subsequently fixed in commit
> > 11d4616bd07f (futex: revert back to the explicit waiter counting code).
> > Both the original commit and the fix-up rely on get_futex_key_refs() to
> > always imply a barrier.
> >
> > However, for private futexes, none of the cases in the switch statement
> > of get_futex_key_refs() would be hit and the function completes without
> > a memory barrier as required before checking the "waiters" in
> > futex_wake() -> hb_waiters_pending(). The consequence is a race with a
> > thread waiting on a futex on another CPU, allowing the waker thread to
> > read "waiters == 0" while the waiter thread has read "futex_val ==
> > locked" (in kernel).
> >
> > Without this fix, the problem (user space deadlocks) can be seen with
> > Android bionic's mutex implementation on an arm64 multi-cluster system.
>
> How 'bout that, you just triggered my "watch this pot" alarm.
>
> https://lkml.org/lkml/2014/10/8/406
>
> The hang I encountered with stockfish only ever happened on one specific
> box. Linus/Thomas said it was likely a problem with the futex usage,
> but it was suspiciously deterministic, so I put this on the "watch out for
> further evidence" back burner.
>
> The barrier fixing up my problematic box smells a lot like evidence.
>
> > Signed-off-by: Catalin Marinas <catalin.marinas@xxxxxxx>
> > Reported-by: Matteo Franchin <Matteo.Franchin@xxxxxxx>
> > Fixes: b0c29f79ecea (futexes: Avoid taking the hb->lock if there's nothing to wake up)
> > Cc: <stable@xxxxxxxxxxxxxxx>
> > Cc: Davidlohr Bueso <davidlohr@xxxxxx>
> > Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> > Cc: Darren Hart <dvhart@xxxxxxxxxxxxxxx>
> > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > Cc: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> > ---
> > kernel/futex.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/futex.c b/kernel/futex.c
> > index 815d7af2ffe8..f3a3a071283c 100644
> > --- a/kernel/futex.c
> > +++ b/kernel/futex.c
> > @@ -343,6 +343,8 @@ static void get_futex_key_refs(union futex_key *key)
> >  	case FUT_OFF_MMSHARED:
> >  		futex_get_mm(key); /* implies MB (B) */
> >  		break;
> > +	default:
> > +		smp_mb(); /* explicit MB (B) */
> >  	}
> >  }
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at http://www.tux.org/lkml/
>
>
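For anyone who wants to poke at the ordering problem described in the
changelog outside the kernel, here is a rough userspace sketch of the
same store/load pattern using C11 atomics and pthreads. It is not the
kernel code: futex_val, waiters, waiter(), waker() and
count_lost_wakeups() are made-up names for illustration, and
atomic_thread_fence() only stands in for smp_mb(). The waiter publishes
itself and then reads the futex value; the waker writes the futex value
and then reads the waiter count. With a full fence on both sides, the
"waiter saw locked, waker saw no waiters" outcome should not occur.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins, not the kernel's data structures. */
static atomic_int futex_val;   /* 1 = locked, 0 = unlocked */
static atomic_int waiters;     /* models the hash-bucket waiter count */
static int waiter_saw_locked;  /* what the waiter read, checked after join */
static int waker_saw_none;     /* what the waker read, checked after join */

static void *waiter(void *arg)
{
	(void)arg;
	/* Announce ourselves as a waiter, then check the futex value. */
	atomic_store_explicit(&waiters, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);  /* waiter-side barrier */
	waiter_saw_locked =
		atomic_load_explicit(&futex_val, memory_order_relaxed) == 1;
	return NULL;
}

static void *waker(void *arg)
{
	(void)arg;
	/* Unlock, then check whether anybody needs a wakeup. */
	atomic_store_explicit(&futex_val, 0, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);  /* stands in for MB (B) */
	waker_saw_none =
		atomic_load_explicit(&waiters, memory_order_relaxed) == 0;
	return NULL;
}

/* Run the two-thread pattern 'trials' times, count lost wakeups. */
int count_lost_wakeups(int trials)
{
	int lost = 0;

	for (int i = 0; i < trials; i++) {
		pthread_t a, b;

		atomic_store(&futex_val, 1);
		atomic_store(&waiters, 0);
		pthread_create(&a, NULL, waiter, NULL);
		pthread_create(&b, NULL, waker, NULL);
		pthread_join(a, NULL);
		pthread_join(b, NULL);

		/* Waiter would sleep while the waker skips the wakeup. */
		if (waiter_saw_locked && waker_saw_none)
			lost++;
	}
	return lost;
}
```

Build with -pthread and call count_lost_wakeups() from a small main().
Dropping the fence in waker() models the private-futex path before the
patch; whether the bad outcome then actually shows up depends on the
hardware and on timing, so a clean run without the fence proves nothing.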

