debugging solid SMP lockups [was: Re: Solid lockup on 2.1.78]

MOLNAR Ingo (mingo@chiara.csoma.elte.hu)
Thu, 15 Jan 1998 14:21:25 +0100 (CET)


On Thu, 15 Jan 1998, T Taneli Vahakangas wrote:

> I'm reporting this, but I don't believe this is of any help. The scenario:
> I was using X with moderate load when the machine locked. No, I don't
> think it was because of X, since Magic-SysRq didn't work.

this is not a solution to your problem, but a related point: if anyone
experiences reproducible lockups on SMP systems, there is an easy way to
debug them without additional hardware: apply the io-apic patch, make
the keyboard IRQ an NMI (a one-line change), then reproduce the lockup
and use SysRq to get a trace ... The keyboard should not be used while
reproducing the lockup.
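
for illustration only (this is not the actual one-line change from the
io-apic patch): on an 82093AA-style IO-APIC, bits 8-10 of a redirection
table entry hold the delivery mode, and the value 100b selects NMI. A
minimal sketch of flipping an entry to NMI delivery follows; the names
rte_make_nmi and rte_delivery_mode are hypothetical helpers, not kernel
functions, and the sketch only manipulates the 64-bit value rather than
writing it to hardware.

```c
#include <stdint.h>

/* Low-word fields of an IO-APIC redirection table entry (82093AA layout). */
#define RTE_DELIVERY_SHIFT 8
#define RTE_DELIVERY_MASK  (UINT64_C(7) << RTE_DELIVERY_SHIFT)
#define RTE_DELIVERY_NMI   UINT64_C(4)          /* 100b = NMI */
#define RTE_TRIGGER_LEVEL  (UINT64_C(1) << 15)  /* 0 = edge, 1 = level */

/* Hypothetical helper: return a copy of 'entry' that delivers an NMI.
 * The vector field is ignored by the hardware in NMI mode, and the
 * entry has to be edge-triggered, so the level bit is cleared. */
uint64_t rte_make_nmi(uint64_t entry)
{
    entry &= ~RTE_DELIVERY_MASK;
    entry |= RTE_DELIVERY_NMI << RTE_DELIVERY_SHIFT;
    entry &= ~RTE_TRIGGER_LEVEL;
    return entry;
}

/* Hypothetical helper: extract the 3-bit delivery mode of an entry. */
unsigned rte_delivery_mode(uint64_t entry)
{
    return (unsigned)((entry & RTE_DELIVERY_MASK) >> RTE_DELIVERY_SHIFT);
}
```

in a real kernel the modified entry would be written back to the
redirection table slot of the keyboard (IRQ 1); how the io-apic patch
does that is not shown here.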

if the keyboard has to stay usable, then another solution is to get an
oops when one pushes the 'Turbo' switch. I can send patches that make the
timer IRQ an NMI in a safe way (easy); the NMI handler then monitors CPU
speed. If the CPU speed suddenly drops (because the 'Turbo' switch was
pressed), the timer NMI handler generates an oops message by sending a
special broadcast NMI to all CPUs.
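
a rough sketch of the speed check, under assumptions that are not in the
patches themselves: the handler can read a free-running cycle counter
(for example the TSC), and the timer fires at a fixed wall-clock rate, so
the cycle count between two ticks is proportional to CPU speed. The
struct and function names are hypothetical, and the "less than half of
the baseline" threshold is an arbitrary choice for illustration.

```c
#include <stdint.h>

/* Hypothetical per-CPU state for the speed check. Zero-initialize it. */
struct speed_monitor {
    uint64_t last_cycles;  /* cycle counter value at the previous tick */
    uint64_t baseline;     /* cycles per tick at full speed, 0 = unset */
    int      have_last;    /* nonzero once last_cycles is valid */
};

/* Call once per timer tick with the current cycle counter value.
 * Returns 1 if the CPU appears to have slowed down, 0 otherwise.
 * The first tick only records the counter; the second tick sets the
 * baseline; later ticks are compared against that baseline. */
int speed_monitor_tick(struct speed_monitor *m, uint64_t now_cycles)
{
    uint64_t delta;

    if (!m->have_last) {
        m->last_cycles = now_cycles;
        m->have_last = 1;
        return 0;
    }

    delta = now_cycles - m->last_cycles;
    m->last_cycles = now_cycles;

    if (m->baseline == 0) {
        m->baseline = delta;
        return 0;
    }

    /* Assumed threshold: fewer than half the baseline cycles per tick. */
    return delta < m->baseline / 2;
}
```

when the function returns 1, the handler would be the place to send the
broadcast NMI that makes every CPU print its oops.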

so anyone with 2.1 SMP lockups, speak up, the tools are there ...

-- mingo