RE: [PATCH] EDAC/i10nm: shift exponent is negative

From: Zhuo, Qiuxu
Date: Fri Jun 30 2023 - 04:31:36 EST


> From: Luck, Tony <tony.luck@xxxxxxxxx>
> Sent: Friday, June 30, 2023 12:12 AM
> To: Zhuo, Qiuxu <qiuxu.zhuo@xxxxxxxxx>; Koba Ko <koba.ko@xxxxxxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxxxx>; James Morse <james.morse@xxxxxxx>;
> Mauro Carvalho Chehab <mchehab@xxxxxxxxxx>; Robert Richter
> <rric@xxxxxxxxxx>; linux-edac@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: RE: [PATCH] EDAC/i10nm: shift exponent is negative
>
> > I don't agree with simply skipping over a DIMM even if EDAC doesn't expect
> > to see it. The EDAC driver can still report errors for this DIMM if errors
> > occur on it.
> >
> > As per Tony's suggestion, could you test your kernel with
> > CONFIG_EDAC_DEBUG=y and see the result?
> >
> > @Luck, Tony, perhaps we could turn the debug print
> >
> > edac_dbg(2, "bad %s = %d (raw=0x%x)\n", name, val, reg);
> >
> > into an explicit error print
> >
> > skx_printk(KERN_ERR, "bad %s = %d (raw=0x%x)\n", name, val, reg);
> >
> > so that the user has a chance to notice there is a DIMM that EDAC doesn't
> > expect to see.
>
> We need both. Changing that debug message to a real error message will let
> the user know that EDAC doesn't recognize this DIMM (and will provide the
> information for you or me to fix the driver).
>
> But we also need Ko's fix - because it makes no sense to just use that
> negative shift and pretend that EDAC knows how to handle this DIMM.
>

OK.
@Koba Ko, could you make a new patch with Tony's suggestion? Thanks!
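
To make the combined request concrete, here is a rough user-space sketch of
the two pieces together: report the out-of-range field loudly, and refuse to
compute a size from a failed decode instead of shifting by a negative amount.
This is not the driver code. The helper names (get_dimm_attr, dimm_size_mib),
the bit positions, and the limits used below are assumptions for illustration
only, and fprintf() stands in for the kernel's skx_printk().

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the attribute decoder: extract bits
 * [hibit:lobit] of reg, range-check the field, and report a bad value
 * unconditionally rather than only in debug builds.
 */
static int get_dimm_attr(uint32_t reg, int lobit, int hibit, int add,
                         int minval, int maxval, const char *name)
{
    int width = hibit - lobit + 1;
    uint32_t mask = (width >= 32) ? 0xffffffffu : ((1u << width) - 1u);
    int val = (int)((reg >> lobit) & mask);

    if (val < minval || val > maxval) {
        /* In the kernel this would be an error-level print. */
        fprintf(stderr, "bad %s = %d (raw=0x%x)\n", name, val, (unsigned)reg);
        return -EINVAL;
    }
    return val + add;
}

/*
 * Hypothetical size computation: returns the DIMM size in MiB, or 0 when
 * any attribute failed to decode. Without the check, a negative rows/cols/
 * ranks value would end up as a negative shift exponent, which is undefined.
 */
static uint64_t dimm_size_mib(int rows, int cols, int ranks, int banks)
{
    if (rows < 0 || cols < 0 || ranks < 0)
        return 0;

    return ((1ull << (rows + cols + ranks)) * (uint64_t)banks) >> (20 - 3);
}
```

With these assumed parameters, a field value outside the accepted range makes
get_dimm_attr() return -EINVAL after printing the message, and
dimm_size_mib() then reports 0 rather than using that value as a shift count.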

-Qiuxu