Re: qemu:arm test failure due to commit 8053871d0f7f (smp: Fix smp_call_function_single_async() locking)

From: Guenter Roeck
Date: Sun Apr 19 2015 - 00:03:55 EST


On 04/18/2015 08:39 PM, Rabin Vincent wrote:
On Sat, Apr 18, 2015 at 06:56:02PM -0700, Guenter Roeck wrote:
Further debugging (with added WARN_ON if cpu != 0 in smp_call_function_single) shows:

[<800157ec>] (unwind_backtrace) from [<8001250c>] (show_stack+0x10/0x14)
[<8001250c>] (show_stack) from [<80494cb4>] (dump_stack+0x88/0x98)
[<80494cb4>] (dump_stack) from [<80024058>] (warn_slowpath_common+0x84/0xb4)
[<80024058>] (warn_slowpath_common) from [<80024124>] (warn_slowpath_null+0x1c/0x24)
[<80024124>] (warn_slowpath_null) from [<80078fc8>] (smp_call_function_single+0x170/0x178)
[<80078fc8>] (smp_call_function_single) from [<80090024>] (perf_event_exit_cpu+0x80/0xf0)
[<80090024>] (perf_event_exit_cpu) from [<80090110>] (perf_cpu_notify+0x30/0x48)
[<80090110>] (perf_cpu_notify) from [<8003d340>] (notifier_call_chain+0x44/0x84)
[<8003d340>] (notifier_call_chain) from [<8002451c>] (_cpu_up+0x120/0x168)
[<8002451c>] (_cpu_up) from [<800245d4>] (cpu_up+0x70/0x94)
[<800245d4>] (cpu_up) from [<80624234>] (smp_init+0xac/0xb0)
[<80624234>] (smp_init) from [<80618d84>] (kernel_init_freeable+0x118/0x268)
[<80618d84>] (kernel_init_freeable) from [<8049107c>] (kernel_init+0x8/0xe8)
[<8049107c>] (kernel_init) from [<8000f320>] (ret_from_fork+0x14/0x34)
---[ end trace 2f9f1bb8a47b3a1b ]---
smp_call_function_single, cpu=1, wait=1, csd_stack=87825ea0
generic_exec_single, cpu=1, smp_processor_id()=0
csd_lock_wait: csd=87825ea0, flags=0x3

This is repeated for each secondary CPU. But the secondary CPUs don't respond because
they are not enabled, which I guess explains why the lock is never released.

So, in other words, this happens because the system believes (presumably per configuration
/ fdt data) that there are four CPU cores, but that is not really the case. Previously that
did not matter, and was handled correctly. Now it is fatal.

Does this help ?

8053871d0f7f67c7efb7f226ef031f78877d6625 moved the csd locking to the
callers, but the offline check, which was earlier done before the csd
locking, was not moved. The following moves the checks to the earlier
point fixes your boot:


Yes, your patch fixes the problem.

Tested-by: Guenter Roeck <linux@xxxxxxxxxxxx>

Thanks,
Guenter

diff --git a/kernel/smp.c b/kernel/smp.c
index 2aaac2c..ba1fb01 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -159,9 +159,6 @@ static int generic_exec_single(int cpu, struct call_single_data *csd,
}


- if ((unsigned)cpu >= nr_cpu_ids || !cpu_online(cpu))
- return -ENXIO;
-
csd->func = func;
csd->info = info;

@@ -289,6 +286,12 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
&& !oops_in_progress);

+ if (cpu != smp_processor_id()
+ && ((unsigned)cpu >= nr_cpu_ids || !cpu_online(cpu))) {
+ err = -ENXIO;
+ goto out;
+ }
+
csd = &csd_stack;
if (!wait) {
csd = this_cpu_ptr(&csd_data);
@@ -300,6 +303,7 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
if (wait)
csd_lock_wait(csd);

+out:
put_cpu();

return err;
@@ -328,6 +332,12 @@ int smp_call_function_single_async(int cpu, struct call_single_data *csd)

preempt_disable();

+ if (cpu != smp_processor_id()
+ && ((unsigned)cpu >= nr_cpu_ids || !cpu_online(cpu))) {
+ err = -ENXIO;
+ goto out;
+ }
+
/* We could deadlock if we have to wait here with interrupts disabled! */
if (WARN_ON_ONCE(csd->flags & CSD_FLAG_LOCK))
csd_lock_wait(csd);
@@ -336,8 +346,9 @@ int smp_call_function_single_async(int cpu, struct call_single_data *csd)
smp_wmb();

err = generic_exec_single(cpu, csd, csd->func, csd->info);
- preempt_enable();

+out:
+ preempt_enable();
return err;
}
EXPORT_SYMBOL_GPL(smp_call_function_single_async);


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/