Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

From: Dave Jones
Date: Fri Aug 04 2006 - 20:29:54 EST


On Fri, Aug 04, 2006 at 06:24:00PM -0400, Dave Jones wrote:
> > DaveJ, any ideas?
>
> So I'm at a loss to explain why this made this bug suddenly appear, but
> I don't think this is anything new. Fixing it however...
> I'm looking at it, but I don't see anything obvious yet.

Ok, debugging this is getting ridiculous. I dug out a core duo,
built a kernel and watched it spontaneously reboot before it even got to init.
Digging out a serial cable yielded this gem:


Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1020104k/1047108k available (2082k kernel code, 26324k reserved, 1107k data, 240k init, 129604k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.

=============================================
[ INFO: possible recursive locking detected ]
---------------------------------------------
swapper/0 is trying to acquire lock:
(cpu_bitmask_lock){--..}, at: [<c0304e9c>] mutex_lock+0x21/0x24

but task is already holding lock:
(cpu_bitmask_lock){--..}, at: [<c0304e9c>] mutex_lock+0x21/0x24

other info that might help us debug this:
1 lock held by swapper/0:
#0: (cpu_bitmask_lock){--..}, at: [<c0304e9c>] mutex_lock+0x21/0x24

stack backtrace:
[<c0104edc>] show_trace_log_lvl+0x58/0x152
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013ae05>] __lock_acquire+0x74f/0x96d
[<c013b566>] lock_acquire+0x4b/0x6d
[<c0304d35>] __mutex_lock_slowpath+0xb0/0x1f6
[<c0304e9c>] mutex_lock+0x21/0x24
[<c013e8d4>] lock_cpu_hotplug+0x4a/0x4c
[<c016bed3>] kmem_cache_create+0x61/0x577
[<c04af5a1>] kmem_cache_init+0x177/0x38c
[<c049c5f3>] start_kernel+0x221/0x3b3
[<c0100210>] 0xc0100210
DWARF2 unwinder stuck at 0xc0100210
Leftover inexact backtrace:
=======================
=======================
=======================
bad: scheduling from the idle thread!
[<c0104edc>] show_trace_log_lvl+0x58/0x152
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c03035a6>] schedule+0xa6/0x9d3
[<c0304d95>] __mutex_lock_slowpath+0x110/0x1f6
[<c0304e9c>] mutex_lock+0x21/0x24
[<c013e8d4>] lock_cpu_hotplug+0x4a/0x4c
[<c016bed3>] kmem_cache_create+0x61/0x577
[<c04af5a1>] kmem_cache_init+0x177/0x38c
[<c049c5f3>] start_kernel+0x221/0x3b3
[<c0100210>] 0xc0100210
DWARF2 unwinder stuck at 0xc0100210
Leftover inexact backtrace:
=======================
=======================
=======================

<Insert reboot here>

Waaaay before cpufreq even enters the picture.

Sigh.

Dave

--
http://www.codemonkey.org.uk
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/