Re: [tip:smp/hotplug] cpu/hotplug: Abort disabling secondary CPUs if wakeup is pending

From: Boqun Feng
Date: Thu Mar 26 2020 - 22:53:20 EST


Hi Thomas and Pavankumar,

I have a question about this patch, please see below:

On Wed, Jun 12, 2019 at 05:34:08AM -0700, tip-bot for Pavankumar Kondeti wrote:
> Commit-ID: a66d955e910ab0e598d7a7450cbe6139f52befe7
> Gitweb: https://git.kernel.org/tip/a66d955e910ab0e598d7a7450cbe6139f52befe7
> Author: Pavankumar Kondeti <pkondeti@xxxxxxxxxxxxxx>
> AuthorDate: Mon, 3 Jun 2019 10:01:03 +0530
> Committer: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> CommitDate: Wed, 12 Jun 2019 11:03:05 +0200
>
> cpu/hotplug: Abort disabling secondary CPUs if wakeup is pending
>
> When "deep" suspend is enabled, all CPUs except the primary CPU are frozen
> via CPU hotplug one by one. After all secondary CPUs are unplugged the
> wakeup pending condition is evaluated and if pending the suspend operation
> is aborted and the secondary CPUs are brought up again.
>
> CPU hotplug is a slow operation, so it makes sense to check for wakeup
> pending in the freezer loop before bringing down the next CPU. This
> improves the system suspend abort latency significantly.
>

From the commit message, it makes sense to add the pm_wakeup_pending()
check when freeze_secondary_cpus() is used for system suspend. However,
freeze_secondary_cpus() is also used in the kexec path on arm64:

	kernel_kexec():
	  machine_shutdown():
	    disable_nonboot_cpus():
	      freeze_secondary_cpus()

So I wonder whether the pm_wakeup_pending() check makes sense in this
situation. IIUC, in this case we want to reboot the system regardless,
so checking pm_wakeup_pending() seems inappropriate.

I'm asking because I'm debugging a kexec failure on an ARM64 guest on
Hyper-V, where a BUG_ON() was triggered:

[ 108.378016] kexec_core: Starting new kernel
[ 108.378018] Disabling non-boot CPUs ...
[ 108.378019] Wakeup pending. Abort CPU freeze
[ 108.378020] Non-boot CPUs are not disabled
[ 108.378049] ------------[ cut here ]------------
[ 108.378050] kernel BUG at arch/arm64/kernel/machine_kexec.c:154!
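
To make the sequence above concrete, here is a minimal user-space sketch of
it. Everything in it is a hypothetical model, not kernel code: the model_*
names, the fixed CPU count, and the wakeup_pending flag standing in for
pm_wakeup_pending() are all assumptions. It also assumes that the BUG_ON()
at machine_kexec.c:154 fires because more than one CPU is still online,
which is my reading of the log rather than something confirmed here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* arbitrary CPU count for the model */

/* Hypothetical user-space model state; none of this is kernel API. */
struct model_state {
	bool online[MODEL_NR_CPUS];
	bool wakeup_pending;	/* stands in for pm_wakeup_pending() */
};

/* Bring every modelled CPU online and set the wakeup flag. */
static void model_init(struct model_state *s, bool wakeup_pending)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		s->online[cpu] = true;
	s->wakeup_pending = wakeup_pending;
}

/* Count the CPUs that are still online. */
static int model_num_online(const struct model_state *s)
{
	int n = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (s->online[cpu])
			n++;
	return n;
}

/*
 * Mirrors the shape of the patched loop: a pending wakeup is checked
 * before each secondary CPU is taken down, and aborts with -EBUSY.
 */
static int model_freeze_secondary_cpus(struct model_state *s, int primary)
{
	assert(primary >= 0 && primary < MODEL_NR_CPUS);

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (!s->online[cpu] || cpu == primary)
			continue;

		if (s->wakeup_pending)
			return -EBUSY;

		s->online[cpu] = false;	/* stands in for _cpu_down() */
	}
	return 0;
}

/*
 * Models the kexec path: the freeze result is not acted upon, and the
 * caller then expects exactly one CPU to be online (assumption).
 * Returns true if that expectation would be violated.
 */
static bool model_kexec_would_bug(struct model_state *s)
{
	(void)model_freeze_secondary_cpus(s, 0);
	return model_num_online(s) > 1;
}
```

With wakeup_pending set, model_freeze_secondary_cpus() returns -EBUSY before
any CPU goes offline, so model_kexec_would_bug() reports a violation. With
the flag clear, only the primary CPU remains and the check passes.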

Thanks!

Regards,
Boqun

> [ tglx: Massaged changelog and improved printk message ]
>
> Signed-off-by: Pavankumar Kondeti <pkondeti@xxxxxxxxxxxxxx>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: "Rafael J. Wysocki" <rjw@xxxxxxxxxxxxx>
> Cc: Len Brown <len.brown@xxxxxxxxx>
> Cc: Pavel Machek <pavel@xxxxxx>
> Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> Cc: Jiri Kosina <jkosina@xxxxxxx>
> Cc: Mukesh Ojha <mojha@xxxxxxxxxxxxxx>
> Cc: linux-pm@xxxxxxxxxxxxxxx
> Link: https://lkml.kernel.org/r/1559536263-16472-1-git-send-email-pkondeti@xxxxxxxxxxxxxx
>
> ---
> kernel/cpu.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index be82cbc11a8a..0778249cd49d 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -1221,6 +1221,13 @@ int freeze_secondary_cpus(int primary)
>  	for_each_online_cpu(cpu) {
>  		if (cpu == primary)
>  			continue;
> +
> +		if (pm_wakeup_pending()) {
> +			pr_info("Wakeup pending. Abort CPU freeze\n");
> +			error = -EBUSY;
> +			break;
> +		}
> +
>  		trace_suspend_resume(TPS("CPU_OFF"), cpu, true);
>  		error = _cpu_down(cpu, 1, CPUHP_OFFLINE);
>  		trace_suspend_resume(TPS("CPU_OFF"), cpu, false);