Re: [PATCH] PM: sleep: Avoid calling put_device() under dpm_list_mtx

From: Rafael J. Wysocki
Date: Thu Dec 16 2021 - 13:24:22 EST


On 12/16/2021 2:27 PM, Thomas Hellström wrote:
Hi, Rafael,

On 11/4/21 18:26, Rafael J. Wysocki wrote:
It is generally unsafe to call put_device() with dpm_list_mtx held,
because the given device's release routine may carry out an action
depending on that lock which then may deadlock, so modify the
system-wide suspend and resume of devices to always drop dpm_list_mtx
before calling put_device() (and adjust white space somewhat while
at it).

For instance, this prevents the following splat from showing up in
the kernel log after a system resume in certain configurations:


<snip>


@@ -1748,21 +1769,27 @@ int dpm_suspend(pm_message_t state)
          struct device *dev = to_device(dpm_prepared_list.prev);
            get_device(dev);
+
          mutex_unlock(&dpm_list_mtx);
            error = device_suspend(dev);
            mutex_lock(&dpm_list_mtx);
+
          if (error) {
              pm_dev_err(dev, state, "", error);
              dpm_save_failed_dev(dev_name(dev));
-            put_device(dev);
-            break;
-        }
-        if (!list_empty(&dev->power.entry))
+        } else if (!list_empty(&dev->power.entry)) {
              list_move(&dev->power.entry, &dpm_suspended_list);
+        }
+
+        mutex_unlock(&dpm_list_mtx);
+
          put_device(dev);
-        if (async_error)
+
+        mutex_lock(&dpm_list_mtx);
+
+        if (error || async_error)
              break;
      }
      mutex_unlock(&dpm_list_mtx);
@@ -1879,6 +1906,7 @@ int dpm_prepare(pm_message_t state)
          struct device *dev = to_device(dpm_list.next);
            get_device(dev);
+
          mutex_unlock(&dpm_list_mtx);
            trace_device_pm_callback_start(dev, "", state.event);
@@ -1886,21 +1914,23 @@ int dpm_prepare(pm_message_t state)
          trace_device_pm_callback_end(dev, error);
            mutex_lock(&dpm_list_mtx);
-        if (error) {
-            if (error == -EAGAIN) {
-                put_device(dev);
-                error = 0;
-                continue;
-            }
+
+        if (!error) {
+            dev->power.is_prepared = true;
+            if (!list_empty(&dev->power.entry))
+                list_move_tail(&dev->power.entry, &dpm_prepared_list);
+        } else if (error == -EAGAIN) {
+            error = 0;
+        } else {
              dev_info(dev, "not prepared for power transition: code %d\n",
                   error);
-            put_device(dev);
-            break;

It appears the above break disappeared.


          }
-        dev->power.is_prepared = true;
-        if (!list_empty(&dev->power.entry))
-            list_move_tail(&dev->power.entry, &dpm_prepared_list);
+
+        mutex_unlock(&dpm_list_mtx);
+
          put_device(dev);

Should be

                 if (error)

                        break;

Here?

Looks like that.


Symptoms is if we return an error from the device prepare callback, we end up spinning forever with little clue what's going on.


+
+        mutex_lock(&dpm_list_mtx);
      }
      mutex_unlock(&dpm_list_mtx);
      trace_suspend_resume(TPS("dpm_prepare"), state.event, false);

I'll have a look, thanks for the report!