Re: next-20211015: suspend to ram on x86-32 broken

From: Peter Zijlstra
Date: Mon Oct 18 2021 - 10:57:24 EST


On Mon, Oct 18, 2021 at 01:44:29PM +0200, Pavel Machek wrote:

> Reverting smp.c hunk is enough to get suspend/resume to work.

Thanks! Queued the below.
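
For anyone unfamiliar with the pattern, the inversion is the classic ABBA
ordering problem; below is a minimal userspace sketch (invented lock names,
pthreads instead of the kernel primitives, so purely illustrative and not
the actual kernel call chains) of the two conflicting acquisition orders.
Note that lockdep complains about such an ordering even if it never
actually deadlocks at runtime.

/*
 * Illustration only: one path takes the "hotplug" lock and then a
 * subsystem lock; the other path takes them in the opposite order.
 * Compile with: cc -pthread abba.c
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t qos_lock = PTHREAD_MUTEX_INITIALIZER;

static void *hotplug_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&hotplug_lock);	/* analogue of cpus_write_lock() */
	pthread_mutex_lock(&qos_lock);		/* ...then the subsystem lock */
	pthread_mutex_unlock(&qos_lock);
	pthread_mutex_unlock(&hotplug_lock);
	return NULL;
}

static void *qos_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&qos_lock);		/* subsystem lock held... */
	pthread_mutex_lock(&hotplug_lock);	/* ...then the hotplug lock: inversion */
	pthread_mutex_unlock(&hotplug_lock);
	pthread_mutex_unlock(&qos_lock);
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, hotplug_path, NULL);
	pthread_create(&t2, NULL, qos_path, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	puts("no deadlock this run (the inversion is timing dependent)");
	return 0;
}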

---
Subject: sched: Partial revert: "sched: Simplify wake_up_*idle*()"
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Mon Oct 18 16:41:05 CEST 2021

As reported by syzbot and experienced by Pavel, using cpus_read_lock()
in wake_up_all_idle_cpus() creates a lock inversion (against mmap_sem
and possibly others).

Therefore, undo this change and add a comment explaining why :/

Fixes: 8850cb663b5c ("sched: Simplify wake_up_*idle*()")
Reported-by: syzbot+d5b23b18d2f4feae8a67@xxxxxxxxxxxxxxxxxxxxxxxxx
Reported-by: Pavel Machek <pavel@xxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Tested-by: Pavel Machek <pavel@xxxxxx>
---
kernel/smp.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)

--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -1170,14 +1170,23 @@ void wake_up_all_idle_cpus(void)
 {
 	int cpu;
 
-	cpus_read_lock();
+	/*
+	 * This really should be cpus_read_lock(), because disabling preemption
+	 * over iterating all CPUs is really bad when you have large numbers of
+	 * CPUs, giving rise to large latencies.
+	 *
+	 * Sadly this cannot be, since (ironically) this function is used from
+	 * the cpu_latency_qos stuff which in turn is used under all sorts of
+	 * locks yielding a hotplug lock inversion :/
+	 */
+	preempt_disable();
 	for_each_online_cpu(cpu) {
-		if (cpu == raw_smp_processor_id())
+		if (cpu == smp_processor_id())
 			continue;
 
 		wake_up_if_idle(cpu);
 	}
-	cpus_read_unlock();
+	preempt_enable();
 }
 EXPORT_SYMBOL_GPL(wake_up_all_idle_cpus);
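
As an aside on the raw_smp_processor_id() -> smp_processor_id() change
above: with preemption disabled the checked accessor becomes both safe and
preferable, since CONFIG_DEBUG_PREEMPT now validates the calling context.
A standalone kernel-style sketch (function name invented, not part of the
patch) of when each accessor applies:

#include <linux/preempt.h>
#include <linux/smp.h>

/* Illustration only: which CPU-id accessor fits which context. */
static void cpu_id_accessor_example(void)
{
	int cpu;

	preempt_disable();
	/* Checked accessor: DEBUG_PREEMPT verifies we cannot migrate here. */
	cpu = smp_processor_id();
	preempt_enable();

	/*
	 * Unchecked accessor: legal in preemptible context, but the result
	 * may be stale by the time it is used. The cpus_read_lock() based
	 * loop had to use this variant, since a read lock does not disable
	 * preemption.
	 */
	cpu = raw_smp_processor_id();
	(void)cpu;
}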