Re: radeon KMS causes GART Table Walk Errors (was: K8 ECC error with linux-2.6.32)

From: Borislav Petkov
Date: Fri Dec 18 2009 - 06:56:44 EST


On Thu, Dec 17, 2009 at 08:03:29PM +0100, Johannes Hirte wrote:
> GART Error Reporting was disabled. Here is the output after enabling it:
>
> datengrab ~ # lsmsr MC4 -V3
> MC4_CTL = 0x0000000000003bff
> CorrEccEn=0x1
> UnCorrEccEn=0x1
> CrcErr0En=0x1
> CrcErr1En=0x1
> CrcErr2En=0x1
> SyncPkt0En=0x1
> SyncPkt1En=0x1
> SyncPkt2En=0x1
> MstrAbrtEn=0x1
> TgtAbrtEn=0x1
> GartTblWkEn=0
> AtomicRMWEn=0x1
> WchDogTmrEn=0x1
> DramParEn=0

[.. ]

> MC4_CTL_MASK = 0x0000000000000000
> CorrEccEn=0
> UnCorrEccEn=0
> CrcErr0En=0
> CrcErr1En=0
> CrcErr2En=0
> SyncPkt0En=0
> SyncPkt1En=0
> SyncPkt2En=0
> MstrAbrtEn=0
> TgtAbrtEn=0
> GartTblWkEn=0

Ok, thanks for testing. It looks like your BIOS applies the wrong
workaround when the option is enabled. It clears MC4_CTL[GartTblWkEn],
which only disables reporting of GART table walk errors; the hw still
logs them. What it should do instead is set MC4_CTL_MASK[GartTblWkEn]
to 1 so that logging gets disabled, as recommended in the K8 BKDG.
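
To make the reporting vs. logging distinction concrete, here is a rough
sketch that interprets the two register values from the dump above. It
only does the bit arithmetic and does not touch any MSR. The bit position
(10) is an assumption taken from the field order in the lsmsr output, and
the function name is made up for illustration:

```python
# Assumption: GartTblWkEn is bit 10 in both MC4_CTL and MC4_CTL_MASK,
# inferred from the field order printed by lsmsr.
GART_TBL_WK_BIT = 10


def gart_tbl_walk_state(mc4_ctl, mc4_ctl_mask):
    """Return (reported, logged) for GART table walk errors.

    reported: MC4_CTL[GartTblWkEn] is set, so the error is reported.
    logged:   MC4_CTL_MASK[GartTblWkEn] is clear, so the hw still logs it.
    """
    bit = 1 << GART_TBL_WK_BIT
    reported = bool(mc4_ctl & bit)
    logged = not (mc4_ctl_mask & bit)
    return reported, logged


# Values from the dump: reporting is off, but logging is still on.
print(gart_tbl_walk_state(0x3BFF, 0x0))

# With the mask bit set instead: logging is off too.
print(gart_tbl_walk_state(0x3BFF, 1 << GART_TBL_WK_BIT))
```

The first call corresponds to what the BIOS does now, the second to the
setting the BKDG recommends.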

--
Regards/Gruss,
Boris.

Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. München, Germany
Research | Geschäftsführer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
(OSRC) | Registergericht München, HRB Nr. 43632
