Re: [PATCH] locking/rwsem: Disable preemption while trying for rwsem lock

From: Mukesh Ojha
Date: Tue Sep 06 2022 - 08:44:49 EST


Hi,

On 9/3/2022 2:25 AM, Waiman Long wrote:

On 9/1/22 06:28, Mukesh Ojha wrote:
From: Gokul krishna Krishnakumar <quic_gokukris@xxxxxxxxxxx>

Make the region inside the rwsem_write_trylock non preemptible.

We observe RT task is hogging CPU when trying to acquire rwsem lock
which was acquired by a kworker task but before the rwsem owner was set.

Here is the scenario:
1. CFS task (affined to a particular CPU) takes rwsem lock.

2. CFS task gets preempted by a RT task before setting owner.

3. RT task (FIFO) is trying to acquire the lock, but spinning until
RT throttling happens for the lock as the lock was taken by CFS task.

Note that the spinning is likely caused by the following code in rwsem_down_write_slowpath():

1163                 /*
1164                  * After setting the handoff bit and failing to acquire
1165                  * the lock, attempt to spin on owner to accelerate lock
1166                  * transfer. If the previous owner is a on-cpu writer and it
1167                  * has just released the lock, OWNER_NULL will be returned.
1168                  * In this case, we attempt to acquire the lock again
1169                  * without sleeping.
1170                  */
1171                 if (waiter.handoff_set) {
1172                         enum owner_state owner_state;
1173
1174                         preempt_disable();
1175                         owner_state = rwsem_spin_on_owner(sem);
1176                         preempt_enable();
1177
1178                         if (owner_state == OWNER_NULL)
1179                                 goto trylock_again;
1180                 }

rwsem_optimistic_spin() limits RT task one additional attempt if OWNER_NULL is returned. There is no such limitation in this loop. So an alternative will be to put a limit on the number of times an OWNER_NULL return values will be allowed to continue spinning without sleeping. That put the burden on the slowpath instead of in the fastpath.

Other than the slight overhead in the fastpath, the patch should work too.

Acked-by: Waiman Long <longman@xxxxxxxxxx>

Thanks Waiman for your time and suggestion.
Would like to take others opinion as well.

-Mukesh


Cheers,
Longman