Re: locking/rwsem: RT throttling issue due to RT task hogging the cpu

From: Mukesh Ojha
Date: Tue Sep 27 2022 - 11:03:55 EST


I was wondering whether the patch below could also help with this issue.

--------------------------------->O---------------------------

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 65f0262..dbe3e16 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -628,8 +628,8 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
                 new = count;
 
                 if (count & RWSEM_LOCK_MASK) {
-                        if (has_handoff || (!rt_task(waiter->task) &&
-                                            !time_after(jiffies, waiter->timeout)))
+                        if (has_handoff || (rt_task(waiter->task) && waiter != first) ||
+                            (!rt_task(waiter->task) && !time_after(jiffies, waiter->timeout)))
                                 return false;

-Mukesh


On 9/26/2022 5:16 PM, Mukesh Ojha wrote:
Hi,

Any comments on this issue would be helpful.

Thanks,
Mukesh

On 9/20/2022 9:49 PM, Mukesh Ojha wrote:
Hi,

We are observing an issue where sem->owner is not set while sem->count = 6 [1], which means both the RWSEM_FLAG_WAITERS and RWSEM_FLAG_HANDOFF bits are set (see the decode sketch below). Unfolding sem->wait_list shows the following order of waiting processes [2]: [a] is waiting for write, [b] and [c] are waiting for read, and [d] is an RT task with waiter.handoff_set=true that keeps running on cpu7 and never lets the first write waiter [a] run there.

[1]

   sem = 0xFFFFFFD57DDC6680 -> (
     count = (counter = 6),
     owner = (counter = 0),

[2]

[a] kworker/7:0 pid: 32516 ==> [b] iptables-restor pid: 18625 ==> [c] HwBinder:1544_3 pid: 2024 ==> [d] RenderEngine pid: 2032 cpu: 7 prio: 97 (RT task)
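
For reference, count = 6 in [1] decodes via the low-order count flag bits defined in kernel/locking/rwsem.c; a minimal standalone snippet (the macro values are copied from the kernel source, the rest is illustrative):

#include <stdio.h>

/* low-order sem->count bits, as defined in kernel/locking/rwsem.c */
#define RWSEM_WRITER_LOCKED     (1UL << 0)
#define RWSEM_FLAG_WAITERS      (1UL << 1)
#define RWSEM_FLAG_HANDOFF      (1UL << 2)

int main(void)
{
        unsigned long count = 6;        /* value seen in the dump */

        printf("writer locked: %s\n", (count & RWSEM_WRITER_LOCKED) ? "yes" : "no");
        printf("waiters:       %s\n", (count & RWSEM_FLAG_WAITERS) ? "yes" : "no");
        printf("handoff:       %s\n", (count & RWSEM_FLAG_HANDOFF) ? "yes" : "no");
        return 0;       /* no / yes / yes: waiters and handoff set, lock free */
}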


Some time back, Waiman suggested the following change, which could help by
making the RT task leave the cpu.

https://lore.kernel.org/all/8c33f989-8870-08c6-db12-521de634b34e@xxxxxxxxxx/

--------------------------------->O----------------------------

From c6493edd7a5e4f597ea55ff0eb3f1d763b335dfc Mon Sep 17 00:00:00 2001
From: Waiman Long <longman@xxxxxxxxxx>
Date: Tue, 20 Sep 2022 20:50:45 +0530
Subject: [PATCH] locking/rwsem: Yield the cpu after doing handoff optimistic
 spinning

It is possible that the new lock owner (writer) gets preempted before
setting the owner field, and if the current waiter (e.g. an RT task) is
the task that preempted the new lock owner, it will spin in the handoff
loop for a long time. Avoid wasting cpu time and delaying the release of
the lock by yielding the cpu if handoff optimistic spinning has been done
multiple times with a NULL owner.

Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
Signed-off-by: Mukesh Ojha <quic_mojha@xxxxxxxxxxx>
---
 kernel/locking/rwsem.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 65f0262..a875758 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -361,6 +361,8 @@ enum rwsem_wake_type {
  */
 #define MAX_READERS_WAKEUP     0x100
 
+#define MAX_HANDOFF_SPIN       10
+
 static inline void
 rwsem_add_waiter(struct rw_semaphore *sem, struct rwsem_waiter *waiter)
 {
@@ -1109,6 +1111,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 {
         struct rwsem_waiter waiter;
         DEFINE_WAKE_Q(wake_q);
+        int handoff_spins = 0;
 
         /* do optimistic spinning and steal lock if possible */
         if (rwsem_can_spin_on_owner(sem) && rwsem_optimistic_spin(sem)) {
@@ -1167,6 +1170,14 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
                  * has just released the lock, OWNER_NULL will be returned.
                  * In this case, we attempt to acquire the lock again
                  * without sleeping.
+                 *
+                 * It is possible that the new lock owner (writer) gets
+                 * preempted before setting the owner field, and if the
+                 * current waiter (e.g. an RT task) is the task that preempted
+                 * the new lock owner, it will spin in this loop for a long
+                 * time. Avoid wasting cpu time and delaying the release of
+                 * the lock by yielding the cpu if handoff optimistic
+                 * spinning has been done multiple times with a NULL owner.
                  */
                 if (waiter.handoff_set) {
                         enum owner_state owner_state;
@@ -1175,8 +1186,10 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
                         owner_state = rwsem_spin_on_owner(sem);
                         preempt_enable();
 
-                        if (owner_state == OWNER_NULL)
+                        if ((owner_state == OWNER_NULL) && (handoff_spins < MAX_HANDOFF_SPIN)) {
+                                handoff_spins++;
                                 goto trylock_again;
+                        }
                 }
 
                 schedule();
-- 
2.7.4
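
The net effect on the slowpath is that a waiter which keeps observing a NULL owner retries the trylock at most MAX_HANDOFF_SPIN times before falling through to schedule(). A standalone sketch of that bounded retry (spin_on_owner() is a stand-in for rwsem_spin_on_owner() that always returns OWNER_NULL, the pathological case from this report):

#include <stdio.h>

#define MAX_HANDOFF_SPIN 10

enum owner_state { OWNER_NULL, OWNER_NONSPINNABLE };

/* stand-in for rwsem_spin_on_owner(): models a new lock owner that was
 * preempted before setting the owner field, so we always see NULL */
static enum owner_state spin_on_owner(void)
{
        return OWNER_NULL;
}

int main(void)
{
        int handoff_spins = 0;

        for (;;) {
                /* trylock_again: actual lock attempt elided in this model */
                if (spin_on_owner() == OWNER_NULL &&
                    handoff_spins < MAX_HANDOFF_SPIN) {
                        handoff_spins++;
                        continue;       /* goto trylock_again */
                }
                break;                  /* fall through to schedule() */
        }
        printf("retried %d times before scheduling out\n", handoff_spins);
        return 0;
}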


-Mukesh