Re: How to debug complete kernel lock-ups

From: Grant Grundler
Date: Thu Oct 25 2007 - 00:07:07 EST


On Wed, Oct 24, 2007 at 11:17:40AM +0200, John Sigler wrote:
...
> I've tested with a vanilla 2.6.22.10 kernel (no PREEMPT_RT patch).
> That system also locks up and remains completely unresponsive (I can't open
> new ssh sessions, the system won't answer ICMP echo requests).
>
> How do driver writers deal with complete kernel hangs?

Use different HW. Both IA64 and PARISC gives useful diagnostics
when the machine has a hard crash (MCA or HPMC respectively). I'll bet
PPC does too on the POWER machines.

Maybe a newer x86 machine can provide some MCE data as well?

Otherwise it's what gregkh said...not the "we slowly go crazy"
part. :) Well, sometimes. :)

BTW, getting PCI bus traces would be quite helpful in this case.
It'll give you clear data as to whether the devices are being programmed
as expected (also to rule out chipset/Host bus controller issues) and
whether they are responding as expected (maybe something else dies
when they do).

hth,
grant

hth,
grant
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/