Re: [PATCH -v2 1/9] rtmutex: Deboost before waking up the top waiter

From: Thomas Gleixner
Date: Thu Sep 29 2016 - 10:46:46 EST


On Mon, 26 Sep 2016, Peter Zijlstra wrote:

> On Mon, Sep 26, 2016 at 11:37:27AM -0400, Steven Rostedt wrote:
> > On Mon, 26 Sep 2016 11:35:03 -0400
> > Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >
> > > Especially now that the code after the spin_unlock(&hb->lock) is now a
> > > critical section (preemption is disable). There's nothing obvious in
> > > futex.c that says it is.
> >
> > Not to mention, this looks like it will break PREEMPT_RT as wake_up_q()
> > calls sleepable spin locks.
>
> What locks would that be?

None :)

It still breaks RT in the futex case due to:

    deboost = rt_mutex_futex_unlock();

    spin_unlock(&hb->lock);
      ....
      migrate_enable();
        if (in_atomic())
          return;
So the migrate_disable() which was emitted by spin_lock(&hb->lock) will not
be cleaned up and we leak the migrate disable count. We can work around
that, but it's not pretty.
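
To make the leak concrete, here is a minimal userspace sketch of the counting
problem. It is a toy model, not kernel code: struct toy_task and every toy_*
helper are made up for illustration, and the only behaviour taken from the
above is that migrate_enable() returns early when in_atomic() is true.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-task state; hypothetical, NOT the kernel's task_struct. */
struct toy_task {
	int preempt_count;	/* non-zero means "in atomic" in this model */
	int migrate_disable;	/* nesting count taken by toy_spin_lock() */
};

static bool toy_in_atomic(const struct toy_task *t)
{
	return t->preempt_count != 0;
}

static void toy_migrate_disable(struct toy_task *t)
{
	t->migrate_disable++;
}

static void toy_migrate_enable(struct toy_task *t)
{
	/* The early return from the snippet above: nothing is undone. */
	if (toy_in_atomic(t))
		return;

	assert(t->migrate_disable > 0);
	t->migrate_disable--;
}

/* Model of an RT spinlock: lock/unlock pair up migrate disable/enable. */
static void toy_spin_lock(struct toy_task *t)
{
	toy_migrate_disable(t);
}

static void toy_spin_unlock(struct toy_task *t)
{
	toy_migrate_enable(t);
}

/*
 * Returns the migrate disable count left over after one lock/unlock pair.
 * With preempt_disable_before_unlock set, the unlock runs in atomic
 * context, as in the futex case described above.
 */
int toy_leaked_migrate_count(bool preempt_disable_before_unlock)
{
	struct toy_task t = { 0, 0 };

	toy_spin_lock(&t);		/* taken while preemptible */
	if (preempt_disable_before_unlock)
		t.preempt_count++;	/* stands in for preempt_disable() */
	toy_spin_unlock(&t);		/* may hit the early return */
	if (preempt_disable_before_unlock)
		t.preempt_count--;	/* stands in for preempt_enable() */

	return t.migrate_disable;
}
```

In this model the count is balanced when the unlock runs preemptible, and one
count is left behind when preemption was disabled before the unlock.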

As a related note, Sebastian decoded another possible priority inversion
issue in the futex mess.

T1 holds futex

T2 blocks on futex and boosts T1

T1 unlocks futex and holds hb->lock

T1 unlocks rt mutex, so T1 has no more pi waiters

T3 blocks on hb->lock and adds itself to the pi waiters list of T1

T1 unlocks hb->lock and deboosts itself

T4 preempts T1 so the wakeup of T2 gets delayed .....

We tried to fix it with a preempt_disable() and that's where we ran into
that migrate_enable() hiccup. We have a non-deboosting variant of
spin_unlock() for now, but we'll have to revisit that anyway ...
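
For those who want to replay the T1..T4 sequence, a rough userspace sketch
follows. Everything in it is an assumption made for illustration: the
priority values, the "larger number means higher priority" convention, and
all toy_* names. It only models the boost bookkeeping and the point where T4
can get in, nothing of the real rtmutex code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical priorities; a larger number means a higher priority here. */
enum { PRIO_T1 = 1, PRIO_T3 = 3, PRIO_T4 = 5, PRIO_T2 = 10 };

#define TOY_MAX_WAITERS 4

/* Toy lock owner with a list of pi waiter priorities. */
struct toy_owner {
	int base_prio;
	int waiters[TOY_MAX_WAITERS];
	int nr_waiters;
};

/* Effective priority: base priority or the highest waiter, if higher. */
static int toy_effective_prio(const struct toy_owner *o)
{
	int prio = o->base_prio;

	for (int i = 0; i < o->nr_waiters; i++)
		if (o->waiters[i] > prio)
			prio = o->waiters[i];
	return prio;
}

static void toy_add_waiter(struct toy_owner *o, int prio)
{
	assert(o->nr_waiters < TOY_MAX_WAITERS);
	o->waiters[o->nr_waiters++] = prio;
}

static void toy_drop_waiters(struct toy_owner *o)
{
	o->nr_waiters = 0;
}

/*
 * Replays the sequence and reports whether T4 can run before T1 has
 * woken up T2. preempt_disabled models T1 keeping preemption off from
 * the unlock until the wakeup is done.
 */
bool toy_t2_wakeup_delayed(bool preempt_disabled)
{
	struct toy_owner t1 = { .base_prio = PRIO_T1 };

	toy_add_waiter(&t1, PRIO_T2);	/* T2 blocks on futex, boosts T1 */
	toy_drop_waiters(&t1);		/* T1 unlocks rt mutex: no pi waiters */
	toy_add_waiter(&t1, PRIO_T3);	/* T3 blocks on hb->lock */
	toy_drop_waiters(&t1);		/* T1 unlocks hb->lock and deboosts */

	/* T2 is still not woken; T4 sits between T1 and T2 in priority. */
	return !preempt_disabled &&
	       PRIO_T4 > toy_effective_prio(&t1) &&
	       PRIO_T4 < PRIO_T2;
}
```

With these numbers the model reports a delayed wakeup when T1 is preemptible
at that point and none when preemption stays disabled until T2 is woken.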

Thanks,

tglx