SMP poweroff hangs: it's baaaack! But on x86_64 this time.

From: Mark Lord
Date: Wed Dec 17 2008 - 10:46:57 EST


Subject: Fix SMP poweroff hangs
From: Mark Lord <lkml@xxxxxx>

We need to disable all CPUs other than the boot CPU (usually 0) before
attempting to power-off modern SMP machines. This fixes the
hang-on-poweroff issue on my MythTV SMP box, and also on Thomas Gleixner's
new toybox.

Signed-off-by: Mark Lord <mlord@xxxxxxxxx>
Acked-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: "Rafael J. Wysocki" <rjw@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

kernel/sys.c | 2 ++
1 file changed, 2 insertions(+)

diff -puN kernel/sys.c~fix-smp-poweroff-hangs kernel/sys.c
--- a/kernel/sys.c~fix-smp-poweroff-hangs
+++ a/kernel/sys.c
@@ -32,6 +32,7 @@
#include <linux/getcpu.h>
#include <linux/task_io_accounting_ops.h>
#include <linux/seccomp.h>
+#include <linux/cpu.h>
#include <linux/compat.h>
#include <linux/syscalls.h>
@@ -878,6 +879,7 @@ void kernel_power_off(void)
kernel_shutdown_prepare(SYSTEM_POWER_OFF);
if (pm_power_off_prepare)
pm_power_off_prepare();
+ disable_nonboot_cpus();
sysdev_shutdown();
printk(KERN_EMERG "Power down.\n");
machine_power_off();
..

This bug has returned here now, but on x86_86 this time around.
Same machine as before, just upgraded toa 64-bit kernel/user (2.6.27.9)
from the original 32-bit kernel/user that was originally fixed (above).

One hang at poweroff over the past 10 days. Not much, but enough
to destroy confidence in "unattended" operation.

I lack opportunity to dig further into the code for now,
but just wanted to flag the problem, in case similar reports
from others might already be out there.

In the meanwhile, I'm experimenting with this simple patch,
garnered from the 32-bit investigations last time around.
We should know in a few weeks whether it has any effect or not.

--- old/kernel/sys.c 2008-10-18 13:57:22.000000000 -0400
+++ linux-2.6.27.9/kernel/sys.c 2008-12-17 09:42:17.000000000 -0500
@@ -303,6 +303,8 @@

static void kernel_shutdown_prepare(enum system_states state)
{
+ set_cpus_allowed(current, cpumask_of_cpu(first_cpu(cpu_online_map)));
+ disable_nonboot_cpus();
blocking_notifier_call_chain(&reboot_notifier_list,
(state == SYSTEM_HALT)?SYS_HALT:SYS_POWER_OFF, NULL);
system_state = state;
@@ -333,7 +335,6 @@
kernel_shutdown_prepare(SYSTEM_POWER_OFF);
if (pm_power_off_prepare)
pm_power_off_prepare();
- disable_nonboot_cpus();
sysdev_shutdown();
printk(KERN_EMERG "Power down.\n");
machine_power_off();
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/