Re: Machine Check Exception on Opteron 265

From: Joachim Deguara
Date: Mon Apr 16 2007 - 02:32:34 EST


On Saturday 14 April 2007 17:39:28 Robert Hancock wrote:
> Espen Fjellvær Olsen wrote:
> > As far as we know there wasn't any unusual activity on the server at the
> > time.
> > We updated glibc yesterday, but that shouldn't really cause such a
> > problem. So now we wonder if this might be an MCE bug, or really a HW
> > problem, and if it is one of the CPUs, or the RAM that's faulty.
> > We are running 2.6.18.
>
> Sounds like some bad RAM..

Clearly. I would run Memtest86+ [1] overnight; it should reveal the bad
DIMM. You can also try the new feature in version 1.70 of that tool, which
displays the DMI name of the failing DIMM, to help locate the exact module.
That option is under the Error Reporting menu.

-Joachim

[1] http://memtest.org/

