Re: NMI Watchdog detected LOCKUP on CPU1 (stext_lock)(2.4.0-test9-pre2)

From: Keith Owens (kaos@ocs.com.au)
Date: Wed Sep 20 2000 - 08:24:36 EST


On Thu, 21 Sep 2000 00:05:03 +1100,
Andrew Morton <andrewm@uow.edu.au> wrote:
>> The downside of this optimization is that all code that is waiting for
>> a lock appears to be in the out of line section and the only label in
>> that section is right at the start. So all lock code appears to be in
>> stext_lock. What really matters is where the stext_lock code jumps
>> back to, that tells you which code is waiting for the lock. In this
>> case you have jumps back to blk_get_queue+9/60 so it is waiting on
>> io_request_lock. Now all you have to do is work out who is holding
>> onto io_request_lock.
>
>What do you think of this addition to the x86 debug code, Keith:
>
>If you get an NMI oops it says:
>
> NMI Watchdog detected LOCKUP on CPU1, registers:
> CPU: 1
> EIP: 0010:[<c01fe212>]
> EFLAGS: 00000086
> Waiting on spinlock! Spinner's EIP is [<c0130d7a>]
> ...

Is the extra code worth it? The ix86 oops dump runs the stack printing
anything that looks like a kernel address. This means that we "always"
get the function that wants the lock, we just do not get the precise
address of the call to spinlock(). All right, we do not always get the
current function, sometimes we only get the function that called the
function that call spinlock(). But even that seems to be enough for
oops.

kdb has to go to all the bother of decoding the .text.lock sections to
get an accurate backtrace. Oops decoding has never pretended to be
accurate, it gets lots of false positives.

BTW, you made the same mistake I did with .text_lock. The kernel is
not the only place where section .text.lock exists, each module has its
own .text.lock section. That is one of the reasons that the kdb symbol
table processing needs so much data, to extract sections from modules
and find symbols for return addresses from all the strange out of line
code - man kallsyms for the gory details.

>Waiting on spinlock! Spinner's EIP is [<c0130d7a>]
>Will ksymoops automatically parse the EIP?

Not as it stands, but yet another regular expression is no big deal.
IMHO the existing oops gives enough data, given the known restrictions
of stack backtraces without full unwind support.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Sep 23 2000 - 21:00:23 EST