Re: Pls help me understand this MCE

From: David King
Date: Thu Aug 11 2005 - 13:02:44 EST


Petr Vandrovec wrote:
> Try dumping *all* MCE values, as well as a call stack. Even though the
> MCE is tagged as processor context corrupt, there is a rather big chance
> that the stack trace will point back to the instruction which caused the
> MCE (it always did in my case), especially on a single-processor system.
> Then you'll at least know which subsystem/driver did that.

Ok, here's everything I got from the serial console when the error
occurred. I don't have a clue how to interpret this stuff so I'd be
eternally grateful if someone out there can help. Or, if I
misunderstood what you were telling me I ought to do, then explaining
the process a bit more would be appreciated too.

CPU 0: Machine Check Exception: 4 Bank 4: b200000000070f0f
TSC 7cba18189a
Kernel panic - not syncing: Machine check

Call Trace: <#MC> <ffffffff8013a4b5>{panic+133}
<ffffffff80116d48>{print_mce+136}
<ffffffff80116e19>{mce_panic+137} <ffffffff801173f2>{do_machine_check+754}
<ffffffff80110147>{machine_check+127}
<ffffffff80113dec>{timer_interrupt+444}
<EOE> <IRQ> <ffffffff80146b50>{process_timeout+0}
<ffffffff801704dc>{handle_IRQ_event+44} <ffffffff801706ed>{__do_IRQ+477}
<ffffffff801120b8>{do_IRQ+72} <ffffffff8010f6c3>{ret_from_intr+0}
<EOI> <ffffffff8010d230>{default_idle+0} <ffffffff8010d252>{default_idle+34}
<ffffffff8010d291>{cpu_idle+49} <ffffffff8057e7e5>{start_kernel+469}
<ffffffff8057e1f4>{_sinittext+500}
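
As a starting point for reading the "Bank 4" status word above, here is a
minimal Python sketch that splits it into its generic fields. It assumes the
architectural x86 machine-check status layout (flag bits 63..57, a
model-specific error code in bits 31:16, the MCA error code in bits 15:0)
and the bus/interconnect compound code format 0000 1PPT RRRR IILL. The
function and field names are made up for illustration and are not taken
from the kernel.

```python
# Flag bits in the upper part of a machine-check status word
# (assumed architectural layout; names follow the vendor manuals).
FLAGS = [
    (63, "VAL", "register contents are valid"),
    (62, "OVER", "an earlier error was overwritten"),
    (61, "UC", "error was not corrected by hardware"),
    (60, "EN", "error reporting was enabled"),
    (59, "MISCV", "MISC register holds extra information"),
    (58, "ADDRV", "ADDR register holds an address"),
    (57, "PCC", "processor context may be corrupt"),
]


def decode_mci_status(status):
    """Split a 64-bit status word into flags and error-code fields."""
    code = status & 0xFFFF
    # Bus/interconnect errors have the form 0000 1PPT RRRR IILL.
    is_bus_error = (code & 0xF800) == 0x0800
    return {
        "flags": [name for bit, name, _ in FLAGS if (status >> bit) & 1],
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": code,
        "is_bus_error": is_bus_error,
        # The T bit (bit 8) is only meaningful for bus/interconnect codes.
        "timeout": is_bus_error and bool((code >> 8) & 1),
    }


# Status value copied from the console output in this message.
decoded = decode_mci_status(int("b200000000070f0f", 16))

descriptions = {name: text for _, name, text in FLAGS}
for name in decoded["flags"]:
    print("%-5s %s" % (name, descriptions[name]))
print("model-specific code: 0x%04x" % decoded["model_specific_code"])
print("MCA error code:      0x%04x" % decoded["mca_error_code"])
print("bus error: %s, timeout: %s" % (decoded["is_bus_error"], decoded["timeout"]))
```

Under those assumptions the value decodes as a valid, uncorrected error with
PCC set, and the low 16 bits fit the bus/interconnect pattern with the
timeout bit set. What bank 4 covers and what the model-specific code means
depend on the CPU model, which is not stated here, so those two pieces need
the vendor's documentation.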

Thanks
--
David King
dave@xxxxxxxxxxxx