Re: [PATCH RFC] smp: Add cpu unstopped mask for smp_send_stop/stop_other_cpus

From: Thomas Gleixner
Date: Tue Aug 20 2019 - 07:25:23 EST


On Tue, 20 Aug 2019, Hsin-Yi Wang wrote:

> In arm/arm64/x86, reboot IPI function uses CPU online mask to let
> primary CPU know how many secondary CPUs it has to wait for in
> smp_send_stop()/native_stop_other_cpus().
>
> However, sometimes this would trigger unnecessary warnings, since
> interrupts and tasks might fall on a CPU that has already executed
> the reboot ipi function. This is fine since CPU is already in spinloop.
> But warnings are generated since it finds that the CPU is marked as
> offiline. The warnings are supposed to catch failures in normal hotplug
> offline CPUs, and reboot isn't a regular hotplug. So instead of reusing
> online masks, we should use a new mask in reboot IPI functions to do the
> work.
>
> Take tick broadcast for example. If broadcast and smp_send_stop()
> happen together, most of the time, the CPU getting earliest broadcast
> is already in spinloop and thus won't do anything. If the first
> broadcast arrives to CPU that hasn't already executed reboot ipi, it
> would try to IPI another CPU, but the CPU is already marked as offline,
> and warning comes out:
>
> [ 22.481523] reboot: Restarting system
> [ 22.481608] WARNING: CPU: 4 PID: 0 at ...

That is really the complete wrong approach. There is no valid reason that a
regular reboot needs to use a shortcut homebrewn variant of stopping CPUs.

The proper solution is to restrict this mechansim to emergency reboots and
let the normal reboot go through the regular CPU hotplug mechanism. That
avoids all that duct tape which is just bound to break tomorrow again.

In case of an emergency reboot, we really do not care about any extra stuff
triggered. The machine is hosed already.

Thanks

tglx