Help interpreting Machine Check Exception

From: Andrew Walrond
Date: Fri Jan 14 2005 - 06:32:26 EST


This is a dual Opteron NUMA machine, and I am seeing messages like this during
boot:

CPU 0: Machine Check Exception: 4 Bank 4: b200000000070f0f
TSC b961d94950
kernel panic - not syncing: Machine check

Which, when run through Dave Jones' tool, says:

andrew@orac test $ ./a.out -b 4 -s b200000000070f0f -e 4 -a 0
Status: (4) Machine Check in progress.
Restart IP invalid.
parsebank(4): b200000000070f0f @ 0
External tag parity error
CPU state corrupt. Restart not possible
Error enabled in control register
Error not corrected.
Bus and interconnect error
Participation: Generic
Timeout:
Request: Generic error
Transaction type : Invalid
Memory/IO : Other
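
As a cross-check on the decoder output above, the flag lines can be reproduced
by hand from the two hex values in the panic message. The following is a
minimal Python sketch, assuming the standard x86 machine-check register layout
(MCG_STATUS bits 0-2, MCi_STATUS bits 63-57, and the 16-bit bus/interconnect
error code format 0000 1PPT RRRR IILL). The table and function names are made
up for illustration and are not taken from Dave Jones' tool; the
model-specific bits are extracted but deliberately not interpreted.

```python
# Sketch only: decodes the architectural flag bits, not vendor-specific codes.

# MCG_STATUS: RIPV = restart IP valid, EIPV = error IP valid,
# MCIP = machine check in progress.
MCG_STATUS_BITS = {0: "RIPV", 1: "EIPV", 2: "MCIP"}

# MCi_STATUS: VAL = valid, OVER = overflow, UC = uncorrected,
# EN = error enabled, MISCV/ADDRV = MISC/ADDR register valid,
# PCC = processor context corrupt.
MCI_STATUS_BITS = {
    63: "VAL", 62: "OVER", 61: "UC", 60: "EN",
    59: "MISCV", 58: "ADDRV", 57: "PCC",
}


def set_flags(value, table):
    """Return the names of the bits set in value, highest bit first."""
    return [name for bit, name in sorted(table.items(), reverse=True)
            if (value >> bit) & 1]


def decode_bus_error(code):
    """Split a 16-bit error code of the form 0000 1PPT RRRR IILL.

    Returns None if the code is not in the bus/interconnect format.
    """
    if (code >> 11) != 0b00001:
        return None
    return {
        "PP": (code >> 9) & 0x3,    # participation
        "T": (code >> 8) & 0x1,     # timeout
        "RRRR": (code >> 4) & 0xF,  # request type
        "II": (code >> 2) & 0x3,    # memory / I/O
        "LL": code & 0x3,           # cache level
    }


if __name__ == "__main__":
    mcg_status = 0x4                 # the "4" after "Machine Check Exception:"
    mci_status = 0xb200000000070f0f  # the Bank 4 status value

    print("MCG_STATUS flags:", set_flags(mcg_status, MCG_STATUS_BITS))
    print("MCi_STATUS flags:", set_flags(mci_status, MCI_STATUS_BITS))
    print("error code fields:", decode_bus_error(mci_status & 0xFFFF))
    print("model-specific bits 31:16: %#06x" % ((mci_status >> 16) & 0xFFFF))
```

With these inputs the set flags agree with the tool: MCIP without RIPV
corresponds to "Machine Check in progress" and "Restart IP invalid", and
UC, EN and PCC correspond to "Error not corrected", "Error enabled in control
register" and "CPU state corrupt". Neither ADDRV nor MISCV is set, so the
bank recorded no address for this error.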


Any clues as to what might be broken? Does "Bus and interconnect error" suggest
the motherboard?

I have reseated all memory and CPUs, and run memtest overnight without error.

Andrew Walrond