Re: [patch V2 4/8] x86/smp: Acquire stopping_cpu unconditionally

From: Peter Zijlstra
Date: Thu Jun 15 2023 - 05:03:11 EST


On Tue, Jun 13, 2023 at 02:17:59PM +0200, Thomas Gleixner wrote:
> There is no reason to acquire the stopping_cpu atomic_t only when there is
> more than one online CPU.
>
> Make it unconditional to prepare for fixing the kexec() problem when there
> are present but "offline" CPUs which play dead in mwait_play_dead().
>
> They need to be brought out of mwait before kexec() as kexec() can
> overwrite text, pagetables, stacks and the monitored cacheline of the
> original kernel. The latter causes mwait to resume execution which
> obviously causes havoc on the kexec kernel which results usually in triple
> faults.
>
> Move the acquire out of the num_online_cpus() > 1 condition so the upcoming
> 'kick mwait' fixup is properly protected.
>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Reviewed-by: Ashok Raj <ashok.raj@xxxxxxxxx>
> ---
> arch/x86/kernel/smp.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> --- a/arch/x86/kernel/smp.c
> +++ b/arch/x86/kernel/smp.c
> @@ -153,6 +153,12 @@ static void native_stop_other_cpus(int w
> if (reboot_force)
> return;
>
> + /* Only proceed if this is the first CPU to reach this code */
> + if (atomic_cmpxchg(&stopping_cpu, -1, safe_smp_processor_id()) != -1)
> + return;
> +
> + atomic_set(&stop_cpus_count, num_online_cpus() - 1);
> +

if (({ int old = -1; !atomic_try_cmpxchg(&stopping_cpu, &old, safe_smp_processor_id()); }))
return;

Doesn't really roll off the tongue, does it :/
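
For anyone comparing the two forms, here is a rough userspace sketch of the
difference in return conventions. It assumes C11 <stdatomic.h> as a stand-in
for the kernel's atomic_t API; claim_cmpxchg() and claim_try_cmpxchg() are
made-up names for illustration, not kernel functions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's stopping_cpu; -1 means "unclaimed". */
static atomic_int stopping_cpu = -1;

/*
 * Shape of atomic_cmpxchg(): hand back the value that was found in *v.
 * The caller has to compare it against -1 to learn whether it won.
 */
static int claim_cmpxchg(atomic_int *v, int cpu)
{
	int old = -1;

	/* On failure C11 writes the observed value into 'old'. */
	atomic_compare_exchange_strong(v, &old, cpu);
	return old;
}

/*
 * Shape of atomic_try_cmpxchg(): report success as a bool. The expected
 * value lives in a local that is updated when the exchange fails.
 */
static bool claim_try_cmpxchg(atomic_int *v, int cpu)
{
	int old = -1;

	return atomic_compare_exchange_strong(v, &old, cpu);
}
```

With the variable starting at -1, the first caller of either helper claims
it and every later caller sees the failure: claim_cmpxchg() returns the
winner's CPU number instead of -1, claim_try_cmpxchg() returns false.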

Also, I don't think anybody cares about performance at this point, so
ignore that I wrote this email.

/me presses send anyway.