[Bisected] Regression: Hang on boot in schedule_timeout_interruptible during ACPI init on SMP

From: Andrew Drake
Date: Wed Jul 23 2008 - 18:46:59 EST


Here's a puzzler for you all,

On my laptop (an ACPI-enabled SMP system), the system hangs during the
"acpi_init" function. I traced it to the schedule_timeout_interruptible
function, which is called if Sleep() is encountered in the DSDT code in one of
the _STA or _INI functions. In this case, I have one in each, and it hangs
twice.

The value being passed to schedule_timeout_interruptible is sane (i.e. <= 25),
but the function never returns. Triggering an interrupt (e.g. by jiggling the
power button) causes the boot to continue.
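
For a sense of scale, here is a rough user-space sketch of how a small Sleep()
delay in milliseconds maps onto timer ticks. It is only a model, not the
kernel's code: the name model_msecs_to_jiffies is made up for illustration, the
HZ values are assumptions about a typical config, and it covers only the case
where HZ divides 1000 evenly. The point is that the timeout is counted in a
handful of ticks, so if ticks are not being delivered the sleep cannot end on
its own, which would fit with an unrelated interrupt un-sticking the boot.

```c
#include <assert.h>

/* Hypothetical user-space model; none of these names are kernel symbols. */
#define MODEL_MSEC_PER_SEC 1000UL

/*
 * Round a delay in milliseconds up to a whole number of timer ticks.
 * Assumption: hz is nonzero, at most 1000, and divides 1000 evenly
 * (e.g. 100, 250 or 1000).
 */
static unsigned long model_msecs_to_jiffies(unsigned long ms, unsigned long hz)
{
    unsigned long ms_per_tick;

    assert(hz > 0 && hz <= MODEL_MSEC_PER_SEC && MODEL_MSEC_PER_SEC % hz == 0);

    ms_per_tick = MODEL_MSEC_PER_SEC / hz;

    /* Round up so that a short sleep never collapses to zero ticks. */
    return (ms + ms_per_tick - 1) / ms_per_tick;
}
```

With an assumed HZ of 250, a 25 ms delay comes out as 7 ticks; with HZ of 1000
it is 25 ticks. Either way the wait should be over almost immediately once the
timer is ticking on the CPU doing the sleep.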

Passing nosmp causes the problem to disappear (but at what an expense!).

I noticed this in the latest kernel and in some 2.6.25-ish kernels, so I
decided to hunt it down. On Linus's tree, the latest good commit was:

commit 1161705bd66df0c80fa45e87190e456c02e6f145
Author: Ingo Molnar <mingo@xxxxxxx>
Date: Wed Mar 19 20:26:15 2008 +0100

x86: fill cpu to apicid and present map in mpparse, fix
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>


and the earliest bad commit was:

commit 802b8133b4f78c30a2668d142d78861e27c0c6a7
Author: Glauber de Oliveira Costa <gcosta@xxxxxxxxxx>
Date: Wed Mar 19 14:25:41 2008 -0300

x86: schedule work only if keventd is already running
Only call schedule_work if keventd is already running.
This is already the way x86_64 does
Signed-off-by: Glauber Costa <gcosta@xxxxxxxxxx>
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>

There are about 14 commits between these two; I was unable to bisect any
further because all 14 of the in-between commits either oops, panic, or
hang while setting up the timer (it appears that the commit immediately
following the known-good one introduces the timer failure, which lasts up
until the known-bad one).

The change "x86: schedule work only if keventd is already running" modifies
smp_boot, which makes it, in my mind, the most likely culprit.

Anybody have any ideas? I'm willing to write a patch if somebody can help me
track down the root cause.

Thanks,

Andrew

P.S. I'm willing to provide any information that you'd like to see, like
my .config or my DSDT (disassembled or otherwise). I didn't include it
in this email because I wasn't sure what would be helpful.