[RFC][PATCH 3/3] test-ww_mutex: Make sure we bail out instead of livelock

From: John Stultz
Date: Tue Aug 08 2023 - 14:40:05 EST


I've seen what appears to be livelocks in the stress_inorder_work()
function, and looking at the code it is clear we can have a case
where we continually retry acquiring the locks and never check to
see if we have passed the specified timeout.

This patch reworks that function so we always check the timeout
before iterating through the loop again.

I believe others may have hit this previously here:
https://lore.kernel.org/lkml/895ef450-4fb3-5d29-a6ad-790657106a5a@xxxxxxxxx/

Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Will Deacon <will@xxxxxxxxxx>
Cc: Waiman Long <longman@xxxxxxxxxx>
Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
Cc: "Paul E . McKenney" <paulmck@xxxxxxxxxx>
Cc: Joel Fernandes <joelaf@xxxxxxxxxx>
Cc: Li Zhijian <zhijianx.li@xxxxxxxxx>
Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
Cc: kernel-team@xxxxxxxxxxx
Reported-by: Li Zhijian <zhijianx.li@xxxxxxxxx>
Link: https://lore.kernel.org/lkml/895ef450-4fb3-5d29-a6ad-790657106a5a@xxxxxxxxx/
Signed-off-by: John Stultz <jstultz@xxxxxxxxxx>
---
kernel/locking/test-ww_mutex.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/test-ww_mutex.c b/kernel/locking/test-ww_mutex.c
index 358d66150426..78719e1ef1b1 100644
--- a/kernel/locking/test-ww_mutex.c
+++ b/kernel/locking/test-ww_mutex.c
@@ -465,17 +465,18 @@ static void stress_inorder_work(struct work_struct *work)
ww_mutex_unlock(&locks[order[n]]);

if (err == -EDEADLK) {
- ww_mutex_lock_slow(&locks[order[contended]], &ctx);
- goto retry;
+ if (!time_after(jiffies, stress->timeout)) {
+ ww_mutex_lock_slow(&locks[order[contended]], &ctx);
+ goto retry;
+ }
}

+ ww_acquire_fini(&ctx);
if (err) {
pr_err_once("stress (%s) failed with %d\n",
__func__, err);
break;
}
-
- ww_acquire_fini(&ctx);
} while (!time_after(jiffies, stress->timeout));

kfree(order);
--
2.41.0.640.ga95def55d0-goog