[PATCH] fix cpu hotplug test failures on powerpc

From: Xiaotian Feng
Date: Wed Dec 16 2009 - 04:16:39 EST


Sachin found cpu hotplug test failures on powerpc which made the kernel
hang on his POWER box. The problem is described in
http://marc.info/?l=linux-kernel&m=126052886204649&w=2

Commit 6ad4c18 (sched: Fix balance vs hotplug race) switched to
cpu_active_mask, but in some situations the kernel can leave a cpu
inactive but still online.

On some powerpc machines, hotplugging cpu0 is allowed. If cpu0 is the
last online cpu and we try to offline it, cpu_down() marks cpu0
inactive. Then _cpu_down() finds that num_online_cpus() is 1 and
returns -EBUSY, but cpu0 is never set back to active. So cpu0 ends up
inactive but online.

The fix is to mark the cpu inactive in _cpu_down(), only once we are
actually going to bring that cpu down.
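
For illustration only, the ordering problem can be modelled in userspace.
This is a rough sketch, not kernel code: struct cpu_model and the model_*
helpers are hypothetical names, two plain bitmasks stand in for
cpu_online_mask and cpu_active_mask, and it assumes the last-cpu check
runs before the point where the patched _cpu_down() clears the active
bit.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy userspace model (hypothetical names), not the kernel implementation. */
#define MODEL_NR_CPUS 8

struct cpu_model {
	unsigned int online;	/* bit n set: cpu n is online */
	unsigned int active;	/* bit n set: cpu n is active */
};

static void model_set_active(struct cpu_model *m, unsigned int cpu, bool on)
{
	assert(cpu < MODEL_NR_CPUS);
	if (on)
		m->active |= 1u << cpu;
	else
		m->active &= ~(1u << cpu);
}

static int model_num_online(const struct cpu_model *m)
{
	int n = 0;

	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		n += (m->online >> cpu) & 1u;
	return n;
}

/*
 * Old ordering: the active bit is cleared first, then the last-cpu
 * check bails out without restoring it.
 */
static int model_cpu_down_old(struct cpu_model *m, unsigned int cpu)
{
	model_set_active(m, cpu, false);
	if (model_num_online(m) == 1)
		return -EBUSY;	/* cpu stays online but inactive */
	m->online &= ~(1u << cpu);
	return 0;
}

/*
 * New ordering: the active bit is cleared only after the checks that
 * can refuse the request have passed.
 */
static int model_cpu_down_new(struct cpu_model *m, unsigned int cpu)
{
	if (model_num_online(m) == 1)
		return -EBUSY;	/* state is left untouched */
	model_set_active(m, cpu, false);
	m->online &= ~(1u << cpu);
	return 0;
}
```

With a single online cpu, model_cpu_down_old() returns -EBUSY and leaves
the active bit cleared, while model_cpu_down_new() returns -EBUSY with
both masks unchanged.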

Reported-by: Sachin Sant <sachinp@xxxxxxxxxx>
Signed-off-by: Xiaotian Feng <dfeng@xxxxxxxxxx>
Tested-by: Sachin Sant <sachinp@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Rusty Russell <rusty@xxxxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxx>
Cc: H. Peter Anvin <hpa@xxxxxxxxx>
Cc: Heiko Carstens <heiko.carstens@xxxxxxxxxx>
---
kernel/cpu.c | 8 ++++++--
1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 291ac58..a1e7165 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -209,6 +209,7 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
 		return -ENOMEM;

 	cpu_hotplug_begin();
+	set_cpu_active(cpu, false);
 	err = __raw_notifier_call_chain(&cpu_chain, CPU_DOWN_PREPARE | mod,
 					hcpu, -1, &nr_calls);
 	if (err == NOTIFY_BAD) {
@@ -280,8 +281,6 @@ int __ref cpu_down(unsigned int cpu)
 		goto out;
 	}

-	set_cpu_active(cpu, false);
-
 	/*
 	 * Make sure the all cpus did the reschedule and are not
 	 * using stale version of the cpu_active_mask.
@@ -387,12 +386,6 @@ int disable_nonboot_cpus(void)
 	 */
 	cpumask_clear(frozen_cpus);

-	for_each_online_cpu(cpu) {
-		if (cpu == first_cpu)
-			continue;
-		set_cpu_active(cpu, false);
-	}
-
 	synchronize_sched();

 	printk("Disabling non-boot CPUs ...\n");