Re: Linux 2.6.33-rc1

From: Borislav Petkov
Date: Sun Dec 20 2009 - 14:14:53 EST


On Sun, Dec 20, 2009 at 06:53:24PM +0100, Torsten Kaiser wrote:
> On Sat, Dec 19, 2009 at 8:54 PM, Torsten Kaiser
> <just.for.lkml@xxxxxxxxxxxxxx> wrote:
> > [    5.061998] EDAC MC: Ver: 2.1.0 Dec 18 2009
> > [    5.062186] EDAC amd64_edac:  Ver: 3.3.0 Dec 18 2009
> > [    5.062235] EDAC amd64: ECC is enabled by BIOS.
> > [    5.062297] EDAC amd64: ECC is enabled by BIOS.
> > [    5.128332] EDAC MC: Rev F or later detected
> > [    5.134186] EDAC amd64: amd64_read_mc_registers: error reading F2x190.
> > [    5.142290] EDAC amd64: amd64_read_mc_registers: error reading F2x194.
> > [    5.150355] EDAC MC: DCT0 chip selects:
> > [    5.150357] EDAC MC:  0:   512MB 1:   512MB
> > [    5.150358] EDAC MC:  2:     0MB 3:     0MB
> > [    5.150361] EDAC MC:  4:     0MB 5:     0MB
> > [    5.150362] EDAC MC:  6:     0MB 7:     0MB
> > [    5.150519] EDAC MC0: Giving out device to 'amd64_edac' 'RevF': DEV 0000:00:18.2
> > [    5.150522] EDAC MC: Rev F or later detected
> > [    5.150530] EDAC amd64: amd64_read_mc_registers: error reading F2x190.
> > [    5.150532] EDAC amd64: amd64_read_mc_registers: error reading F2x194.
> > [    5.150533] EDAC MC: DCT0 chip selects:
> > [    5.150535] EDAC MC:  0:   512MB 1:   512MB
> > [    5.150536] EDAC MC:  2:     0MB 3:     0MB
> > [    5.150537] EDAC MC:  4:     0MB 5:     0MB
> > [    5.150539] EDAC MC:  6:     0MB 7:     0MB
> > [    5.150664] EDAC MC1: Giving out device to 'amd64_edac' 'RevF': DEV 0000:00:19.2
> > [    5.150742] EDAC PCI0: Giving out device to module 'amd64_edac' controller 'EDAC PCI controller': DEV '0000:00:18.2' (POLLED)
> >
> > The system has 4x 1GB RAM sticks (2 on each CPU).

What are those DIMMs: single or dual ranked? Can you give me the exact
model name?

> After reading the code in drivers/edac/amd64_edac.c and the
> documentation in the AMD reference doc (#32559, I have Rev. 3.08) the
> bug is, that the current code does not try to differentiate between
> the 64bit and the 128bit mode.
> In the doc the sizes for the 64bit mode in table 10, section 4.5.8.1
> are identical to the table ddr2_dbam in amd64_edac.c.
> But for the 128bit mode the table 11 should be used, there the sizes
> are doubled.
>
> The code uses the bit 11 (named F10_WIDTH_128 in amd64_edac.h) of the
> lower DRAM configuration register to determine the number of channels
> in k8_early_channel_count(), but this is not used in
> amd64_debug_display_dimm_sizes()

That might be the case. Can you enable CONFIG_EDAC_DEBUG and
CONFIG_EDAC_DEBUG_VERBOSE and rebuild your kernel, please? Then send me
the _whole_ dmesg output. If the output appears truncated, try enlarging
the log buffer by setting log_buf_len on the kernel command line to
something large, e.g. 'log_buf_len=10M'.
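
To make the hypothesis concrete, here is a rough sketch of what "double
the size in 128bit mode" would amount to. It is not the code in
amd64_edac.c: cs_size_mb() and WIDTH_128_BIT are hypothetical names made
up for illustration, the 64bit size is passed in instead of being looked
up in ddr2_dbam, and only the bit position (bit 11 of the lower DRAM
configuration register) is taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bit 11 of the lower DRAM configuration register. According to the
 * report above, amd64_edac.h names this bit F10_WIDTH_128; the macro
 * name used here is only a stand-in for this sketch.
 */
#define WIDTH_128_BIT (1u << 11)

/*
 * Hypothetical helper, not a function in the driver.
 *
 * size_64bit_mb: chip select size in MB as it would be read from the
 *                64bit-mode table.
 * dclr:          raw value of the lower DRAM configuration register.
 *
 * Returns the size unchanged in 64bit mode and doubled in 128bit mode.
 */
static uint32_t cs_size_mb(uint32_t size_64bit_mb, uint32_t dclr)
{
	/* Guard against overflow when doubling. */
	assert(size_64bit_mb <= UINT32_MAX / 2);

	if (dclr & WIDTH_128_BIT)
		return size_64bit_mb * 2;

	return size_64bit_mb;
}
```

With the 512MB chip selects from the log and bit 11 set, this would give
1024MB per chip select, i.e. 2GB per node, which would match two 1GB
sticks per CPU if the 128bit reading is right. The debug output should
tell us whether that bit is actually set on this machine.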

>
> > And there is no line like 'EDAC PCI0' for the DRAM controller of the
> > second CPU (19.2). Is that normal?
>
> amd64_edac_init() calls amd64_init_2nd_stage() for each northbridge,
> but amd64_setup_pci_device() only once.
>
> But from looking at the code, I can't see if a second device is needed or not.

No, it's not, since the EDAC PCI code seems to scan all known PCI
devices anyway.

Thanks.

--
Regards/Gruss,
Boris.