Re: [PATCH] PM: sleep: Avoid calling put_device() under dpm_list_mtx

From: Thomas Hellström
Date: Thu Dec 16 2021 - 08:27:36 EST


Hi, Rafael,

On 11/4/21 18:26, Rafael J. Wysocki wrote:
It is generally unsafe to call put_device() with dpm_list_mtx held,
because the given device's release routine may carry out an action
depending on that lock which then may deadlock, so modify the
system-wide suspend and resume of devices to always drop dpm_list_mtx
before calling put_device() (and adjust white space somewhat while
at it).

For instance, this prevents the following splat from showing up in
the kernel log after a system resume in certain configurations:


<snip>


@@ -1748,21 +1769,27 @@ int dpm_suspend(pm_message_t state)
struct device *dev = to_device(dpm_prepared_list.prev);
get_device(dev);
+
mutex_unlock(&dpm_list_mtx);
error = device_suspend(dev);
mutex_lock(&dpm_list_mtx);
+
if (error) {
pm_dev_err(dev, state, "", error);
dpm_save_failed_dev(dev_name(dev));
- put_device(dev);
- break;
- }
- if (!list_empty(&dev->power.entry))
+ } else if (!list_empty(&dev->power.entry)) {
list_move(&dev->power.entry, &dpm_suspended_list);
+ }
+
+ mutex_unlock(&dpm_list_mtx);
+
put_device(dev);
- if (async_error)
+
+ mutex_lock(&dpm_list_mtx);
+
+ if (error || async_error)
break;
}
mutex_unlock(&dpm_list_mtx);
@@ -1879,6 +1906,7 @@ int dpm_prepare(pm_message_t state)
struct device *dev = to_device(dpm_list.next);
get_device(dev);
+
mutex_unlock(&dpm_list_mtx);
trace_device_pm_callback_start(dev, "", state.event);
@@ -1886,21 +1914,23 @@ int dpm_prepare(pm_message_t state)
trace_device_pm_callback_end(dev, error);
mutex_lock(&dpm_list_mtx);
- if (error) {
- if (error == -EAGAIN) {
- put_device(dev);
- error = 0;
- continue;
- }
+
+ if (!error) {
+ dev->power.is_prepared = true;
+ if (!list_empty(&dev->power.entry))
+ list_move_tail(&dev->power.entry, &dpm_prepared_list);
+ } else if (error == -EAGAIN) {
+ error = 0;
+ } else {
dev_info(dev, "not prepared for power transition: code %d\n",
error);
- put_device(dev);
- break;

It appears the above break disappeared.


}
- dev->power.is_prepared = true;
- if (!list_empty(&dev->power.entry))
- list_move_tail(&dev->power.entry, &dpm_prepared_list);
+
+ mutex_unlock(&dpm_list_mtx);
+
put_device(dev);

Should be

                 if (error)

                        break;

Here?

Symptoms is if we return an error from the device prepare callback, we end up spinning forever with little clue what's going on.


+
+ mutex_lock(&dpm_list_mtx);
}
mutex_unlock(&dpm_list_mtx);
trace_suspend_resume(TPS("dpm_prepare"), state.event, false);

Thanks,

Thomas