RE: [PATCH] EDAC/i10nm: shift exponent is negative

From: Zhuo, Qiuxu
Date: Thu Jun 29 2023 - 06:03:45 EST


Hi Ko,

I don't agree with simply skipping over a DIMM, even if EDAC doesn't expect to see it,
because the EDAC driver can still report errors for this DIMM once errors occur in it.

As per Tony's suggestion, could you test your kernel with CONFIG_EDAC_DEBUG=y and see the result?

@Luck, Tony, perhaps we could turn the debug print

edac_dbg(2, "bad %s = %d (raw=0x%x)\n", name, val, reg);

into an explicit error print

skx_printk(KERN_ERR, "bad %s = %d (raw=0x%x)\n", name, val, reg);

This would give the user a chance to notice that there is a DIMM that EDAC doesn't expect to see.
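
To illustrate why the invalid attribute matters, here is a minimal user-space sketch of the range check quoted further down in this thread. It is not the kernel code: get_bitfield() is a stand-in for GET_BITFIELD, the "return val + add" tail is an assumption (the quoted excerpt stops before it), the debug print is replaced by fprintf(), and safe_shift() is a hypothetical helper that only shows a caller rejecting a negative attribute before using it as a shift count.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* User-space stand-in for GET_BITFIELD: extract bits lo..hi (inclusive). */
static uint32_t get_bitfield(uint32_t v, int lo, int hi)
{
	return (uint32_t)((v >> lo) & ((1ULL << (hi - lo + 1)) - 1));
}

/*
 * Same shape as the skx_get_dimm_attr() excerpt quoted in this thread.
 * Assumption: on success the function returns val + add.
 */
static int get_dimm_attr(uint32_t reg, int lobit, int hibit, int add,
			 int minval, int maxval, const char *name)
{
	uint32_t val = get_bitfield(reg, lobit, hibit);

	if (val < (uint32_t)minval || val > (uint32_t)maxval) {
		/* Always printed here, so an unexpected DIMM is visible. */
		fprintf(stderr, "bad %s = %u (raw=0x%x)\n", name, val, reg);
		return -EINVAL;
	}

	return (int)val + add;
}

/*
 * Hypothetical helper: refuse to shift when an attribute is negative
 * (for example -EINVAL), since a negative shift count is undefined in C.
 */
static long long safe_shift(int rows, int cols)
{
	if (rows < 0 || cols < 0)
		return -EINVAL;

	return 1LL << (rows + cols);
}
```

With a hypothetical 2-bit field at bits 12..13 and an allowed range of 0..2, a raw value of 0x3000 is rejected with -EINVAL, and passing that result on to safe_shift() returns an error instead of shifting by a negative count.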

- Qiuxu

> From: Koba Ko <koba.ko@xxxxxxxxxxxxx>
> Sent: Thursday, June 29, 2023 11:53 AM
> To: Luck, Tony <tony.luck@xxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxxxx>; James Morse <james.morse@xxxxxxx>;
> Mauro Carvalho Chehab <mchehab@xxxxxxxxxx>; Robert Richter
> <rric@xxxxxxxxxx>; linux-edac@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH] EDAC/i10nm: shift exponent is negative
>
> Hi Luck,
> I agree with your points.
> Is it expected to shift by a negative value?
>
> Thanks
> Koba Ko
>
> On Thu, Jun 29, 2023 at 12:41 AM Luck, Tony <tony.luck@xxxxxxxxx> wrote:
> >
> > >         ranks = numrank(mtr);
> > >         rows = numrow(mtr);
> > >         cols = imc->hbm_mc ? 6 : numcol(mtr);
> > > +       if (ranks == -EINVAL || rows == -EINVAL || cols == -EINVAL)
> > > +               return 0;
> >
> > This seems to be just hiding the real problem that a DIMM was found
> > with some number of ranks, rows, or columns that the EDAC driver
> > didn't expect to see. Your fix makes the driver skip over this DIMM.
> >
> > Can you build your kernel with CONFIG_EDAC_DEBUG=y and see what
> > messages you get from this code:
> >
> > static int skx_get_dimm_attr(u32 reg, int lobit, int hibit, int add,
> >                              int minval, int maxval, const char *name)
> > {
> >         u32 val = GET_BITFIELD(reg, lobit, hibit);
> >
> >         if (val < minval || val > maxval) {
> >                 edac_dbg(2, "bad %s = %d (raw=0x%x)\n", name, val, reg);
> >                 return -EINVAL;
> >         }
> >
> > -Tony
> >
> >