Re: [PATCH] futex: Avoid reusing outdated pi_state.

From: Thomas Gleixner
Date: Wed Jan 17 2024 - 13:37:14 EST


On Tue, Jan 16 2024 at 14:08, Sebastian Andrzej Siewior wrote:
> @@ -628,10 +628,15 @@ int futex_unqueue(struct futex_q *q)
> /*
> * PI futexes can not be requeued and must remove themselves from the
> * hash bucket. The hash bucket lock (i.e. lock_ptr) is held.
> + * If the PI futex was not acquired (due to timeout or signal) then it removes
> + * its rt_waiter before it removes itself from the futex queue. The unlocker
> + * will remove the futex_q from the queue if it observes an empty waitqueue.
> + * Therefore the unqueue is optional in this case.

This explanation is as confusing as the changelog.

> */
> -void futex_unqueue_pi(struct futex_q *q)
> +void futex_unqueue_pi(struct futex_q *q, bool have_lock)
> {
> - __futex_unqueue(q);
> + if (have_lock || !plist_node_empty(&q->list))
> + __futex_unqueue(q);

With the callsite moved up, 'have_lock == true' implies that
'plist_node_empty()' is 'false', no?

So the 'have_lock' argument is clearly pointless.
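
To spell the redundancy out, here is a minimal userspace sketch, not kernel
code. The helper names are hypothetical, and it assumes the invariant stated
above: a caller passing have_lock == true is still queued, so the combination
(have_lock && node_empty) is unreachable.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition as written in the patch: have_lock || !plist_node_empty(). */
static bool unqueue_cond_patch(bool have_lock, bool node_empty)
{
	return have_lock || !node_empty;
}

/* Condition without the extra argument (hypothetical simplification). */
static bool unqueue_cond_simple(bool node_empty)
{
	return !node_empty;
}

/*
 * Walk every reachable (have_lock, node_empty) state and assert that both
 * conditions agree. The state (have_lock && node_empty) is skipped because
 * it is assumed to be impossible. Returns the number of states checked.
 */
static int verify_have_lock_redundant(void)
{
	int checked = 0;

	for (int have_lock = 0; have_lock <= 1; have_lock++) {
		for (int node_empty = 0; node_empty <= 1; node_empty++) {
			if (have_lock && node_empty)
				continue; /* assumed unreachable */
			assert(unqueue_cond_patch(have_lock, node_empty) ==
			       unqueue_cond_simple(node_empty));
			checked++;
		}
	}
	return checked;
}
```

The two conditions differ only in the excluded state, so if the invariant
holds the argument carries no information.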

> BUG_ON(!q->pi_state);
> put_pi_state(q->pi_state);
> diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h
> index 8b195d06f4e8e..c7133ffb381fd 100644
> --- a/kernel/futex/futex.h
> +++ b/kernel/futex/futex.h
> @@ -252,7 +252,7 @@ static inline void futex_queue(struct futex_q *q, struct futex_hash_bucket *hb)
> spin_unlock(&hb->lock);
> }
>
> -extern void futex_unqueue_pi(struct futex_q *q);
> +extern void futex_unqueue_pi(struct futex_q *q, bool have_lock);
>
> extern void wait_for_owner_exiting(int ret, struct task_struct *exiting);
>
> diff --git a/kernel/futex/pi.c b/kernel/futex/pi.c
> index 90e5197f4e569..4023841358eea 100644
> --- a/kernel/futex/pi.c
> +++ b/kernel/futex/pi.c
> @@ -1070,6 +1070,7 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int flags, ktime_t *time, int tryl
> * haven't already.
> */
> res = fixup_pi_owner(uaddr, &q, !ret);
> + futex_unqueue_pi(&q, !ret);
> /*
> * If fixup_pi_owner() returned an error, propagate that. If it acquired
> * the lock, clear our -ETIMEDOUT or -EINTR.
> @@ -1077,7 +1078,6 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int flags, ktime_t *time, int tryl
> if (res)
> ret = (res < 0) ? res : 0;
>
> - futex_unqueue_pi(&q);

Without the have_lock argument these two hunks are not required.

> spin_unlock(q.lock_ptr);
> goto out;
>
> @@ -1135,6 +1135,7 @@ int futex_unlock_pi(u32 __user *uaddr, unsigned int flags)
>
> hb = futex_hash(&key);
> spin_lock(&hb->lock);
> +retry_hb:
>
> /*
> * Check waiters first. We do not trust user space values at
> @@ -1177,12 +1178,15 @@ int futex_unlock_pi(u32 __user *uaddr, unsigned int flags)
> /*
> * Futex vs rt_mutex waiter state -- if there are no rt_mutex
> * waiters even though futex thinks there are, then the waiter
> - * is leaving and the uncontended path is safe to take.
> + * is leaving. We need to remove it from the list so that the
> + * current PI-state is not observed by future pi_futex_lock()
> + * caller before the leaving waiter had a chance to clean up.
> */
> rt_waiter = rt_mutex_top_waiter(&pi_state->pi_mutex);
> if (!rt_waiter) {
> + __futex_unqueue(top_waiter);
> raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
> - goto do_uncontended;
> + goto retry_hb;

This clearly lacks a comment explaining that there might be more than one
waiter in the hash bucket which has removed itself from the rtmutex and is
now blocked on the hash bucket lock.
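
A toy userspace model of why the retry can loop more than once. This is a
sketch only: struct toy_bucket and toy_unlock() are made up for illustration
and model neither locking nor the real futex_q/pi_state handling. It assumes
that every queued entry found while the rtmutex has no waiters is a leaving
waiter which has to be dequeued before looking at the bucket again.

```c
#include <assert.h>
#include <stddef.h>

enum unlock_path { UNLOCK_UNCONTENDED, UNLOCK_HANDOVER };

/* Hypothetical stand-in for one hash bucket plus its PI state. */
struct toy_bucket {
	size_t queued;          /* futex_q entries still in the hash bucket */
	size_t rtmutex_waiters; /* waiters still enqueued on the rtmutex */
	size_t retries;         /* how often the retry path was taken */
};

static enum unlock_path toy_unlock(struct toy_bucket *hb)
{
	/* Each iteration corresponds to one pass starting at retry_hb. */
	while (hb->queued > 0) {
		/* A real rtmutex waiter exists: hand the lock over to it. */
		if (hb->rtmutex_waiters > 0)
			return UNLOCK_HANDOVER;

		/*
		 * The top waiter already left the rtmutex and is blocked
		 * on the bucket lock. Dequeue it and look again, since
		 * further leaving waiters may be queued behind it.
		 */
		hb->queued--;
		hb->retries++;
	}

	/* Nobody is queued anymore: the uncontended path is safe. */
	return UNLOCK_UNCONTENDED;
}
```

With three leaving waiters queued and none on the rtmutex, the model takes
the retry path three times before reaching the uncontended unlock.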

Thanks,

tglx