Re: [PATCH] locking/mutex: fix the MUTEX_FLAG_HANDOFF bit is cleared unexpected

From: Xu, Yanfei
Date: Tue Jun 29 2021 - 05:52:46 EST




On 6/29/21 1:57 AM, Waiman Long wrote:
[Please note: This e-mail is from an EXTERNAL e-mail address]

On 6/28/21 11:51 AM, Yanfei Xu wrote:
When the mutex unlock path is excuted with WAITERS bit and without
HANDOFF bit set, it will wake up the first task in wait_list. If
there are some tasks not in wait_list are stealing lock, it is very
likely successfully due to the task field of lock is NULL and flags
field is non-NULL. Then the HANDOFF bit will be cleared. But if the
HANDOFF bit was just set by the waked task in wait_list, this clearing
is unexcepted.

I think you mean "unexpected". Right? Anyway, all the setting and

Right. It's my fault.

clearing of the HANDOFF bit are atomic. There shouldn't be any
unexpected clearing.

__mutex_lock_common                   __mutex_lock_common
   __mutex_trylock                       schedule_preempt_disabled
     /*steal lock successfully*/ __mutex_set_flag(lock,MUTEX_FLAG_HANDOFF)
     __mutex_trylock_or_owner
       if (task==NULL)
       flags &= ~MUTEX_FLAG_HANDOFF
       atomic_long_cmpxchg_acquire
                                         __mutex_trylock  //failed
                                         mutex_optimistic_spin  //failed
                                         schedule_preempt_disabled //sleep without HANDOFF bit

So the HANDOFF bit should be set as late as possible, here we defer
it util the task is going to be scheduled.
Signed-off-by: Yanfei Xu <yanfei.xu@xxxxxxxxxxxxx>
---

Hi maintainers,

I am not very sure if I missed or misunderstanded something, please help
to review. Many thanks!

  kernel/locking/mutex.c | 8 ++++----
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 013e1b08a1bf..e57d920e96bf 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -1033,17 +1033,17 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
              }

              spin_unlock(&lock->wait_lock);
+
+             if (first)
+                     __mutex_set_flag(lock, MUTEX_FLAG_HANDOFF);
              schedule_preempt_disabled();

              /*
               * ww_mutex needs to always recheck its position since its waiter
               * list is not FIFO ordered.
               */
-             if (ww_ctx || !first) {
+             if (ww_ctx || !first)
                      first = __mutex_waiter_is_first(lock, &waiter);
-                     if (first)
-                             __mutex_set_flag(lock, MUTEX_FLAG_HANDOFF);
-             }

              set_current_state(state);
              /*

In general, I don't mind setting the HANDOFF bit later, but
mutex_optimistic_spin() at the end of the loop should only be called
after the HANDOFF bit is set. So the logic isn't quite right yet.

Thanks for your reply.

Why the mutex_optimistic_spin should be called after the HANDOFF bit is
set? I think the HANDOFF bit is only related to unlock path, and the
mutex_optimistic_spin and __mutex_trylock don't actually use it. (Or I
missed something? )

Maybe I described the issue not much clearly. Let me try again.

The HANDOFF bit aims to avoid lock starvation. Lock starvation is
possible because mutex_lock() allows lock stealing, where a runing( or
optimistic spinning) task beats the woken waiter to the acquire. So
maintainer add HANDOFF bit, once the stealing happened, the top-waiter
will must get the lock at the second wake up due to the HANDOFF bit.

However, the fact is if the stealing happened, the HANDOFF will must be
clear by the thief task. Hence the top-waiter still might starve at the
second wake up.

__mutex_trylock
->__mutex_trylock_or_owner
->flags &= ~MUTEX_FLAG_HANDOFF
->atomic_long_cmpxchg_acquire(&lock->owner, owner, curr | flags)

The next unlock path will not see HANDOFF bit, and will not give the
lock to the top-waiter. The task starves.

May we could add a parameter to the __mutex_trylock to judge if the task
is the top-waiter. If yes, pickup the lock and clear MUTEX_FLAG_HANDOFF,
if not, do not clear MUTEX_FLAG_HANDOFF. Like this:

__mutex_trylock
->__mutex_trylock_or_owner
->if(first) flags &= ~MUTEX_FLAG_HANDOFF
->atomic_long_cmpxchg_acquire(&lock->owner, owner, curr | flags)

But I think the cost of this is higher than my patch.


The other problem my patch solved is separate __mutex_set_flag(lock,
MUTEX_FLAG_HANDOFF) out of the if statement which is "if (ww_ctx ||
!first) ". Or once the HANDOFF is cleared by thief task, it will never
be set again.

if (ww_ctx || !first) {
first = __mutex_waiter_is_first(lock, &waiter);
if (first)
__mutex_set_flag(lock, MUTEX_FLAG_HANDOFF);
}

change to

if (ww_ctx || !first)
first = __mutex_waiter_is_first(lock, &waiter);
if (first)
__mutex_set_flag(lock, MUTEX_FLAG_HANDOFF);



Thanks,
Yanfei








Cheers,
Longman