Re: BUG: kernel-2.6.27-rc5: soft lockup - CPU#X stuck for 61s!

From: Vegard Nossum
Date: Sat Aug 30 2008 - 11:30:26 EST


On Sat, Aug 30, 2008 at 2:46 PM, Thomas Backlund <tmb@xxxxxxxxxxxx> wrote:
> Hi, (please cc me as I'm not subscribed)
>
> thought I would post this right now, I'll try to reproduce it with vanilla
> 2.6.27-rc5 as soon as the buildhost is back up...
>
> (vanilla 2.6.27-rc5 x86_64 also locked up my laptop, which is a
> Core2 Duo T8300, during a kernel build with make -j3, but
> I don't have any logs of that yet)
>
> Kernel: 2.6.27-rc5 + Mandriva patches
> Config:
> http://svn.mandriva.com/cgi-bin/viewvc.cgi/packages/cooker/kernel/branches/rebase-to-2.6.27/PATCHES/configs/x86_64.config?revision=276760&view=markup
> Arch: x86_64
>
> System: Intel Quad Core Q9300
>
> getting this with netconsole while building kernel rpms with make -j5:

Hi,

I tried your recipe. Well, not exactly. But I also got this during a
kernel build. Notice that the kernel version is a clean v2.6.27-rc5
(i.e. latest Linus -git):

BUG: soft lockup - CPU#1 stuck for 61s! [swapper:0]
irq event stamp: 3585444
hardirqs last enabled at (3585443): [<c015927b>] trace_hardirqs_on+0xb/0x10
hardirqs last disabled at (3585444): [<c0290614>] trace_hardirqs_off_thunk+0xc/8
softirqs last enabled at (3585438): [<c0139fb1>] __do_softirq+0xe1/0x100
softirqs last disabled at (3585427): [<c013a075>] do_softirq+0xa5/0xb0

Pid: 0, comm: swapper Not tainted (2.6.27-rc5-00006-gbef69ea #3)
EIP: 0060:[<c011eb55>] EFLAGS: 00000202 CPU: 1
EIP is at native_safe_halt+0x5/0x10
EAX: 0036b5a3 EBX: c2161b80 ECX: 00000000 EDX: 00000000
ESI: 00000001 EDI: c090fb80 EBP: f7855f80 ESP: f7855f80
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 0807a0b0 CR3: 3695d000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
[<c010ad80>] default_idle+0x50/0x70
[<c01029c8>] cpu_idle+0x68/0x130
[<c059e120>] start_secondary+0x160/0x1c0
=======================
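
A side note on the register dump above: EFLAGS is 00000202, and it may help to
spell out what that encodes. The following is only a minimal sketch; the bit
positions are the architectural x86 ones, while the table and the helper name
decode_eflags are made up for illustration and are not part of any kernel tool.

```python
# Sketch: decode a few x86 EFLAGS bits from the register dump in the trace.
# Bit positions are architectural; the helper name is hypothetical.
EFLAGS_BITS = {
    0: "CF",   # carry
    2: "PF",   # parity
    4: "AF",   # auxiliary carry
    6: "ZF",   # zero
    7: "SF",   # sign
    8: "TF",   # trap
    9: "IF",   # interrupt enable
    10: "DF",  # direction
    11: "OF",  # overflow
}

def decode_eflags(value):
    """Return the names of the known flag bits that are set in value."""
    return [name for bit, name in sorted(EFLAGS_BITS.items())
            if value & (1 << bit)]

# Value taken from the "EFLAGS: 00000202" line of the trace.
flags = decode_eflags(0x00000202)
print(flags)
```

Bit 1 is reserved and always reads as 1, so the only meaningful flag set in
0x202 is IF: interrupts were enabled on CPU#1 at the time of the report.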

c011eb50 <native_safe_halt>:
c011eb50: 55 push %ebp
c011eb51: 89 e5 mov %esp,%ebp
c011eb53: fb sti
c011eb54: f4 hlt
c011eb55: 5d pop %ebp
c011eb56: c3 ret
c011eb57: 89 f6 mov %esi,%esi
c011eb59: 8d bc 27 00 00 00 00 lea 0x0(%edi),%edi

...there aren't many clues here as to what went wrong. Curious.
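
For anyone who wants to double-check the "EIP is at native_safe_halt+0x5/0x10"
line against the disassembly, here is a rough sketch of the arithmetic. The
addresses and mnemonics are copied from the objdump output above; the DISASM
table and the resolve() helper are hypothetical names used only for this
illustration.

```python
import re

# Disassembly lines copied from the post: address -> instruction.
DISASM = {
    0xc011eb50: "push %ebp",
    0xc011eb51: "mov %esp,%ebp",
    0xc011eb53: "sti",
    0xc011eb54: "hlt",
    0xc011eb55: "pop %ebp",
    0xc011eb56: "ret",
}

def resolve(symbol_ref, symbol_base):
    """Turn 'name+0xOFF/0xSIZE' into (name, absolute address, size)."""
    m = re.fullmatch(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", symbol_ref)
    if m is None:
        raise ValueError("unrecognised symbol reference: %r" % symbol_ref)
    name = m.group(1)
    offset = int(m.group(2), 16)
    size = int(m.group(3), 16)
    return name, symbol_base + offset, size

# Symbol reference from the trace, base address from the disassembly header.
name, addr, size = resolve("native_safe_halt+0x5/0x10", 0xc011eb50)

# The instruction that ran just before the reported EIP.
previous = max(a for a in DISASM if a < addr)

print("%s: EIP=%08x -> %s (previous: %s)"
      % (name, addr, DISASM[addr], DISASM[previous]))
```

The computed address is c011eb55, matching the EIP in the register dump, and
the instruction just before it is hlt. That is where a CPU sleeping in hlt
resumes once an interrupt wakes it, so the trace only shows CPU#1 sitting in
the idle loop when the watchdog fired.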

(Yes, I played with CPU hotplug and hibernate as well.)


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036