Re: Reading a bad sector does not report failure as 'read error' buthangs PC with 'Machine Check Exception'

From: Robert Hancock
Date: Wed Aug 01 2007 - 05:45:28 EST


Hendrik . wrote:
Ok, I did actually not copy the coreret code in the
mcelog, leaving me some errors about the Northbridge.
If I do it again it gives me something else. I made 2
digital photo's of 2 lockups when it happened and this
is the result of the tool, the TSC is different in
both errors, the rest is the same:

--------------------
CPU 0 4 northbridge TSC b7d4a144d0 Northbridge Watchdog error
bit57 = processor context corrupt
bit61 = error uncorrected
bus error 'generic participation, request timed out
generic error mem transaction
generic access, level generic'
STATUS b200000000070f0f MCGSTATUS 4
This is not a software problem!

Presumably some access that the CPU is doing to the controller has timed out and caused the MCE. It might be useful if we could get a stack trace from where the MCE was triggered - does anyone know if it's possible to do this?

It's possible that only NVidia really could tell why this error would result from a disk problem, though.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@xxxxxxxxxxxxx
Home Page: http://www.roberthancock.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/