[patch V2 4/8] x86/smp: Acquire stopping_cpu unconditionally

From: Thomas Gleixner
Date: Tue Jun 13 2023 - 08:18:13 EST


There is no reason to acquire the stopping_cpu atomic_t only when there is
more than one online CPU.

Make it unconditional to prepare for fixing the kexec() problem when there
are present but "offline" CPUs which play dead in mwait_play_dead().

They need to be brought out of mwait before kexec() as kexec() can
overwrite text, pagetables, stacks and the monitored cacheline of the
original kernel. A write to the monitored cacheline causes mwait to resume
execution, which wreaks havoc on the kexec kernel and usually results in
triple faults.

Move the acquire out of the num_online_cpus() > 1 condition so the upcoming
'kick mwait' fixup is properly protected.

Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Reviewed-by: Ashok Raj <ashok.raj@xxxxxxxxx>
---
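Illustration only, not part of the patch: a minimal userspace sketch of the
"first CPU wins" claim that the change makes unconditional. It assumes C11
<stdatomic.h> in place of the kernel's atomic_t API. The helper names
claim_stop() and need_stop_ipi() are hypothetical and do not exist in
arch/x86/kernel/smp.c.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel's stopping_cpu and stop_cpus_count. */
static atomic_int stopping_cpu = -1;
static atomic_int stop_cpus_count = 0;

/*
 * Hypothetical helper: try to become the CPU which runs the stop sequence.
 * Returns true if 'cpu' won the claim and false if another CPU got there
 * first. 'online' plays the role of num_online_cpus().
 */
static bool claim_stop(int cpu, int online)
{
	int expected = -1;

	/* Succeeds only for the first caller, even with a single CPU online */
	if (!atomic_compare_exchange_strong(&stopping_cpu, &expected, cpu))
		return false;

	/* Number of other CPUs which have to acknowledge the stop request */
	atomic_store(&stop_cpus_count, online - 1);
	return true;
}

/* Hypothetical helper: the IPI is only needed if there are other online CPUs */
static bool need_stop_ipi(void)
{
	return atomic_load(&stop_cpus_count) > 0;
}
```

With one online CPU the claim still succeeds and stopping_cpu is set, but
need_stop_ipi() returns false, which corresponds to the
atomic_read(&stop_cpus_count) > 0 check in the diff below.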
arch/x86/kernel/smp.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -153,6 +153,12 @@ static void native_stop_other_cpus(int w
 	if (reboot_force)
 		return;

+	/* Only proceed if this is the first CPU to reach this code */
+	if (atomic_cmpxchg(&stopping_cpu, -1, safe_smp_processor_id()) != -1)
+		return;
+
+	atomic_set(&stop_cpus_count, num_online_cpus() - 1);
+
 	/*
 	 * Use an own vector here because smp_call_function
 	 * does lots of things not suitable in a panic situation.
@@ -167,13 +173,7 @@ static void native_stop_other_cpus(int w
 	 * code. By syncing, we give the cpus up to one second to
 	 * finish their work before we force them off with the NMI.
 	 */
-	if (num_online_cpus() > 1) {
-		/* did someone beat us here? */
-		if (atomic_cmpxchg(&stopping_cpu, -1, safe_smp_processor_id()) != -1)
-			return;
-
-		atomic_set(&stop_cpus_count, num_online_cpus() - 1);
-
+	if (atomic_read(&stop_cpus_count) > 0) {
 		apic_send_IPI_allbutself(REBOOT_VECTOR);

 		/*