Athlon64 + Nforce4 MCE panic

From: Rumi Szabolcs
Date: Thu Jul 13 2006 - 02:49:36 EST


Hello!

Tonight I had a kernel panic (full hang) with the following on the console:

CPU 0: Machine Check Exception: 0000000000000004
Bank 4: b200000000070f0f
Kernel panic - not syncing: CPU context corrupt

I tried to decode this:

# ./parsemce -e 0000000000000004 -b 4 -s b200000000070f0f -a 0
Status: (4) Machine Check in progress.
Restart IP invalid.
parsebank(4): b200000000070f0f @ 0
External tag parity error
CPU state corrupt. Restart not possible
Error enabled in control register
Error not corrected.
Bus and interconnect error
Participation: Generic
Timeout:
Request: Generic error
Transaction type : Invalid
Memory/IO : Other

# echo 'CPU 0: Machine Check Exception: 0000000000000004 Bank 4: b200000000070f0f' | mcelog --ascii --k8
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 4 northbridge Northbridge Watchdog error
bit57 = processor context corrupt
bit61 = error uncorrected
bus error 'generic participation, request timed out
generic error mem transaction
generic access, level generic'
STATUS b200000000070f0f MCGSTATUS 4

I've been searching for and found some additional info, it
looks like I'm not the first experiencing this problem:

http://kerneltrap.org/node/4993

Here ^^^ it is suggested that there was some thread about A64 MCEs
on LKML but I failed to find it so I decided to post... sorry if
it's redundant.

The kernel is 2.6.16-gentoo-r9 and the CPU is an A64 "Venice" 3500+
I can post further hw/sw environment information if req'd.

Thanks!

Regards,

Sab
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/