Re: [PATCH] semaphore: Add might_sleep() to down_*() family

From: Xiaoming Ni
Date: Sun Aug 08 2021 - 23:51:15 EST


On 2021/8/9 11:01, Waiman Long wrote:
On 8/8/21 10:12 PM, Xiaoming Ni wrote:
A semaphore is a sleeping lock. Add might_sleep() to the down*() family
(with the exception of down_trylock()) to detect sleeping in atomic context.

Previously discussed with Peter Zijlstra, see link:
https://lore.kernel.org/lkml/20210806082320.GD22037@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Signed-off-by: Xiaoming Ni <nixiaoming@xxxxxxxxxx>
---
  kernel/locking/semaphore.c | 4 ++++
  1 file changed, 4 insertions(+)

diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
index 9aa855a96c4a..9ee381e4d2a4 100644
--- a/kernel/locking/semaphore.c
+++ b/kernel/locking/semaphore.c
@@ -54,6 +54,7 @@ void down(struct semaphore *sem)
  {
      unsigned long flags;
+    might_sleep();
      raw_spin_lock_irqsave(&sem->lock, flags);
      if (likely(sem->count > 0))
          sem->count--;
@@ -77,6 +78,7 @@ int down_interruptible(struct semaphore *sem)
      unsigned long flags;
      int result = 0;
+    might_sleep();
      raw_spin_lock_irqsave(&sem->lock, flags);
      if (likely(sem->count > 0))
          sem->count--;
@@ -103,6 +105,7 @@ int down_killable(struct semaphore *sem)
      unsigned long flags;
      int result = 0;
+    might_sleep();
      raw_spin_lock_irqsave(&sem->lock, flags);
      if (likely(sem->count > 0))
          sem->count--;
@@ -157,6 +160,7 @@ int down_timeout(struct semaphore *sem, long timeout)
      unsigned long flags;
      int result = 0;
+    might_sleep();
      raw_spin_lock_irqsave(&sem->lock, flags);
      if (likely(sem->count > 0))
          sem->count--;

I think it is simpler to just put a "might_sleep()" in __down_common() which is the function where sleep can actually happen.


If a sleep actually occurs in atomic context, the corresponding warning is logged by __schedule_bug():
__schedule()
--> schedule_debug()
--> __schedule_bug()

However, might_sleep() flags the mere possibility of sleeping: it warns even when no sleep in atomic context actually takes place, so that developers can identify and fix the problem as early as possible.

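__down_common() is reached only when the semaphore is contended, so a check placed there would miss callers that happen to take the fast path. Isn't it better to put might_sleep() at the entry of each down*() API rather than in __down_common(), to help identify potential problems in the calling code?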
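
To make the difference concrete, here is a small userspace mock (not kernel code) that compares the two placements. Every name in it (mock_sem, mock_in_atomic, mock_might_sleep() and so on) is hypothetical and only illustrates the control flow; the real might_sleep() and __down_common() do considerably more.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace mock for illustration only; all names here are hypothetical. */
struct mock_sem {
	int count;
};

static bool mock_in_atomic;	/* stands in for "caller is in atomic context" */
static int mock_warnings;	/* how many times the mock check fired */

/* Mock of might_sleep(): warns in atomic context, never sleeps itself. */
static void mock_might_sleep(void)
{
	if (mock_in_atomic)
		mock_warnings++;
}

/*
 * Mock slow path: this is where a real implementation would sleep.
 * Returns -1 to mean "the caller would have slept here".
 */
static int mock_down_common(struct mock_sem *sem, bool check_here)
{
	(void)sem;
	if (check_here)
		mock_might_sleep();
	return -1;
}

/* Variant A: check at the API entry, as the patch does. */
static int mock_down_entry_check(struct mock_sem *sem)
{
	mock_might_sleep();
	assert(sem->count >= 0);
	if (sem->count > 0) {
		sem->count--;	/* fast path: semaphore acquired, no sleep */
		return 0;
	}
	return mock_down_common(sem, false);
}

/* Variant B: check only in the slow path. */
static int mock_down_slowpath_check(struct mock_sem *sem)
{
	assert(sem->count >= 0);
	if (sem->count > 0) {
		sem->count--;	/* fast path: the check is never reached */
		return 0;
	}
	return mock_down_common(sem, true);
}
```

With mock_in_atomic set and count == 1, variant A records a warning on the very first call, while variant B stays silent until a call actually hits the contended path.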

Thanks
Xiaoming Ni