Re: [RFC][PATCH 4/4] futex: Rewrite FUTEX_UNLOCK_PI

From: Peter Zijlstra
Date: Thu Nov 24 2016 - 13:58:21 EST


On Thu, Nov 24, 2016 at 06:56:53PM +0100, Thomas Gleixner wrote:
> > I'm stumped on REQUEUE_PI.. this relies on attach_to_pi_owner() and
>
> You mean LOCK_PI, right?
>
> > fixup_owner() being in the same function. But this is not the case for
> > requeue. WAIT_REQUEUE has the fixup, as its return path finds it has
> > acquired the outer pi-futex (uaddr2), but the lookup_pi_state() stuff is
> > done by CMP_REQUEUE, which does the actual transfer of the waiters from
> > inner futex (uaddr1) to outer futex (uaddr2).
>
> Correct. WAIT_REQUEUE puts the futex on the inner (uaddr1) and then gets
> moved to the outer. From there it's the same thing as LOCK_PI.
>
> > Maybe I can restructure things a bit, I think CMP_REQUEUE would also
> > know who actually acquired the outer-futex, but I have to think more on
> > this and the brain is pretty fried...
>
> That is irrelevant at requeue time and the owner can change up to the point
> where the waiter is really woken by a UNLOCK_PI.

OK, so clearly I'm confused. So let me try again.

LOCK_PI does both in one function: lookup_pi_state() and fixup_owner(). If
fixup_owner() fails with -EAGAIN, we can redo the pi_state lookup.
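
To make that shape concrete, here is a toy userspace sketch (not kernel
code; every toy_* name is made up for illustration and the real functions
take different arguments). It only shows why having the lookup and the
fixup in the same function makes -EAGAIN harmless: we just loop.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model, NOT kernel code: all names here are hypothetical stand-ins. */
struct toy_state {
	int transient_failures;	/* how many times the fixup sees stale state */
	int lookups;		/* how many times the lookup step ran */
};

/* Stand-in for the pi_state lookup step. */
static void toy_lookup(struct toy_state *s)
{
	s->lookups++;
}

/* Stand-in for the owner fixup: -EAGAIN while state is out of sync. */
static int toy_fixup(struct toy_state *s)
{
	if (s->transient_failures > 0) {
		s->transient_failures--;
		return -EAGAIN;
	}
	return 0;
}

/* LOCK_PI shape: both steps in one function, so -EAGAIN can loop back. */
static int toy_lock_pi(struct toy_state *s)
{
	int ret;

	assert(s != NULL);
	do {
		toy_lookup(s);		/* redo the lookup on every retry */
		ret = toy_fixup(s);
	} while (ret == -EAGAIN);

	return ret;
}
```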

The requeue stuff, otoh, has one each. WAIT_REQUEUE has fixup_owner(),
CMP_REQUEUE has lookup_pi_state. Therefore, fixup_owner failing with
-EAGAIN leaves us dead in the water. There's nothing to go back to to
retry.
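
The same kind of toy sketch for the requeue split (again userspace only,
with hypothetical rq_* names, not the actual kernel functions). The lookup
happens in the task doing CMP_REQUEUE, the fixup happens later in the woken
waiter, so the waiter has no lookup of its own to loop back to.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model, NOT kernel code: hypothetical names throughout. */
struct rq_state {
	int stale;	/* nonzero: futex word and rt_mutex state disagree */
	int lookups;	/* how many times the lookup step ran */
};

/* CMP_REQUEUE side: runs in the requeueing task, does the only lookup. */
static void rq_cmp_requeue(struct rq_state *s)
{
	assert(s != NULL);
	s->lookups++;
}

/* WAIT_REQUEUE return path: runs in the woken waiter, fixup only. */
static int rq_wait_requeue_return(struct rq_state *s)
{
	assert(s != NULL);
	if (s->stale)
		return -EAGAIN;	/* no lookup in this function to go back to */
	return 0;
}
```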

So far, so 'good', right?

Now, as far as I understand this requeue stuff, we have 2 futexes, an
inner futex and an outer futex. The inner futex is always 'locked' and
serves as a collection pool for waiting threads.

The requeue crap picks one (or more) waiters from the inner futex and
sticks them on the outer futex, which gives them a chance to run.

So WAIT_REQUEUE blocks on the inner futex, but knows that if it ever
gets woken, it will be on the outer futex, and hence needs to
fixup_owner if the futex and rt_mutex state got out of sync.

CMP_REQUEUE picks the one (or more) waiters of the inner futex and
sticks them on the outer futex.
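
As a toy userspace model of that picture (not kernel code; the toy_* names
and the fixed-size FIFO are assumptions for illustration, real PI waiters
are ordered by priority): waiters park on the inner queue, and a requeue
moves up to nr of them onto the outer queue.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_WAITERS 8

/* Toy model, NOT kernel code: a plain FIFO of hypothetical thread ids. */
struct toy_queue {
	int tid[TOY_MAX_WAITERS];
	size_t len;
};

/* WAIT_REQUEUE side: a waiter parks itself on the inner queue. */
static int toy_wait(struct toy_queue *inner, int tid)
{
	assert(inner != NULL);
	if (inner->len == TOY_MAX_WAITERS)
		return -1;	/* toy queue is full */
	inner->tid[inner->len++] = tid;
	return 0;
}

/* CMP_REQUEUE side: move up to nr waiters from inner to outer. */
static size_t toy_requeue(struct toy_queue *inner, struct toy_queue *outer,
			  size_t nr)
{
	size_t moved = 0;

	assert(inner != NULL && outer != NULL);
	while (moved < nr && moved < inner->len &&
	       outer->len < TOY_MAX_WAITERS)
		outer->tid[outer->len++] = inner->tid[moved++];

	/* Close the gap left at the head of the inner queue. */
	for (size_t i = moved; i < inner->len; i++)
		inner->tid[i - moved] = inner->tid[i];
	inner->len -= moved;

	return moved;	/* number of waiters actually transferred */
}
```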

So far, so 'good' ?

The thing I'm not entirely sure about is what happens with the outer futex,
do we first LOCK_PI it before doing CMP_REQUEUE, giving us waiters, and
then UNLOCK_PI to let them rip? Or do we just CMP_REQUEUE and then let
whoever wins finish with UNLOCK_PI?


In any case, I don't think it matters much, either way we can race
between the 'last' UNLOCK_PI and getting rt_mutex waiters and then hit
the &init_task funny state, such that WAIT_REQUEUE waking hits EAGAIN
and we're 'stuck'.

Now, if we always CMP_REQUEUE to a locked outer futex, then we cannot
know, at CMP_REQUEUE time, who will win and cannot fix up.

The only solution I've come up with so far involves that
rt_mutex_proxy_swizzle() muck which you didn't really fancy much.