Re: [PATCH] powerpc, kexec: Fix "Processor X is stuck" issue during kexec from ST mode

From: Srivatsa S. Bhat
Date: Fri Jun 06 2014 - 08:32:49 EST


On 06/04/2014 07:11 PM, Vivek Goyal wrote:
> On Wed, Jun 04, 2014 at 01:58:40AM +0530, Srivatsa S. Bhat wrote:
>> On 05/28/2014 07:01 PM, Vivek Goyal wrote:
>>> On Tue, May 27, 2014 at 04:25:34PM +0530, Srivatsa S. Bhat wrote:
>>>> If we try to perform a kexec when the machine is in ST (Single-Threaded) mode
>>>> (ppc64_cpu --smt=off), the kexec operation doesn't succeed properly, and we
>>>> get the following messages during boot:
>>>>
[...]
>>>> diff --git a/kernel/kexec.c b/kernel/kexec.c
>>>> index c8380ad..28c5706 100644
>>>> --- a/kernel/kexec.c
>>>> +++ b/kernel/kexec.c
>>>> @@ -1683,6 +1683,14 @@ int kernel_kexec(void)
>>>>  		kexec_in_progress = true;
>>>>  		kernel_restart_prepare(NULL);
>>>>  		migrate_to_reboot_cpu();
>>>> +
>>>> +		/*
>>>> +		 * migrate_to_reboot_cpu() disables CPU hotplug assuming that
>>>> +		 * no further code needs to use CPU hotplug (which is true in
>>>> +		 * the reboot case). However, the kexec path depends on using
>>>> +		 * CPU hotplug again; so re-enable it here.
>>>> +		 */
>>>> +		cpu_hotplug_enable();
>>>>  		printk(KERN_EMERG "Starting new kernel\n");
>>>>  		machine_shutdown();
>>>
>>> After migrate_to_reboot_cpu(), we are calling machine_shutdown() which
>>> calls disable_nonboot_cpus() and which in turn calls _cpu_down().
>>>
>>
>> Hmm? I see only 'arm' calling disable_nonboot_cpus() from machine_shutdown().
>> None of the other architectures call it. Is that a leftover in arm?
>
> You are right. I did not notice that only arm is doing that. Looks like
> it is calling into some platform code, I am not sure what exactly arm
> does for disabling cpu.
>
> x86 code calls stop_other_cpus() in machine_shutdown(), which sends
> REBOOT_VECTOR to the other cpus; they then call stop_this_cpu(), which
> in turn does:
>
> 	for (;;)
> 		halt();
>
> IIUC, upon receipt of certain interrupts a cpu will come out of the halt
> state. I am not sure how safe that is from a kexec point of view: we will
> be replacing the original kernel, so a cpu that comes out of the halt
> state might end up running some random code.
>
> Eric/hpa might know better the context here and what safeguards us on x86.
>
> So one should not make a cpu spin on some code, as kexec will change that
> code. It should be some other platform-specific mechanism which brings
> the cpu into a hlt-like state. That way arm seems to be doing the right
> thing.
>
> I am not sure what powerpc does to stop cpus.
>

powerpc shepherds all CPUs to a safe state, by making them run kexec_smp_down(),
and eventually those CPUs end up calling kexec_wait() in assembly.
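
To make the hotplug interaction from the patch comment quoted above concrete,
here is a toy userspace model in C. It is only a sketch under stated
assumptions: every model_* name is hypothetical and none of this is the
kernel's real implementation. It assumes that a hotplug operation attempted
while hotplug is disabled is refused with -EBUSY, and that the only relevant
side effect of migrate_to_reboot_cpu() is disabling hotplug, as the patch
comment says.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy model only: stands in for the kernel's "hotplug disabled" state. */
static bool hotplug_disabled;

static void model_cpu_hotplug_disable(void) { hotplug_disabled = true; }
static void model_cpu_hotplug_enable(void)  { hotplug_disabled = false; }

/* Assumption: hotplug operations are refused while hotplug is disabled. */
static int model_cpu_up(unsigned int cpu)
{
	if (hotplug_disabled)
		return -EBUSY;
	printf("cpu %u brought online\n", cpu);
	return 0;
}

/* Models only the side effect named in the patch comment. */
static void model_migrate_to_reboot_cpu(void)
{
	model_cpu_hotplug_disable();
}

/*
 * Hypothetical kexec path: migrate, optionally re-enable hotplug (the
 * fix), then try a hotplug operation. Returns that operation's result.
 */
int model_kexec_path(bool apply_fix)
{
	model_cpu_hotplug_enable();	/* start from a clean state */
	model_migrate_to_reboot_cpu();
	if (apply_fix)
		model_cpu_hotplug_enable();
	return model_cpu_up(1);
}
```

In this model, model_kexec_path(false) fails with -EBUSY because hotplug is
still disabled after the migration step, while model_kexec_path(true)
succeeds, which is the behaviour the added cpu_hotplug_enable() call is
meant to restore.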

Regards,
Srivatsa S. Bhat
