RE: [PATCH 1/1] EDAC/igen6: Fix the issue of no error events

From: Zhuo, Qiuxu
Date: Tue Jul 25 2023 - 22:32:00 EST


> From: Luck, Tony <tony.luck@xxxxxxxxx>
> ...
> Subject: RE: [PATCH 1/1] EDAC/igen6: Fix the issue of no error events
>
> > Fix this issue by moving the pending error handler after the
> > registration of the error handler, ensuring that no pending errors are
> > left unhandled.
>
> Do you think drivers/edac/e7xxx_edac.c has the same issue?

Hi Tony,

Based on the code [1], e7xxx_edac works in polling mode, so any pending
errors are handled within one polling period after the error handler is
registered.

So, I don't think the e7xxx_edac has the same issue.

[1] e7xxx_probe1() -> mci->edac_check = e7xxx_check;
                   -> edac_mc_add_mc() -> if (mci->edac_check) { mci->op_state = OP_RUNNING_POLL; ... }

>
> /* clear any pending errors, or initial state bits */
> e7xxx_get_error_info(mci, &discard);

This function is also invoked periodically in polling mode:
mci->edac_check() -> e7xxx_check() -> e7xxx_get_error_info()
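
As a rough illustration of why polling covers that window, here is a minimal
user-space sketch. It is not kernel code: struct fake_ctl and the fake_*
helpers are hypothetical names made up for this example, and the latch is
only assumed to behave like a read-and-clear status bit.

```c
#include <stdbool.h>

/* Hypothetical model of a latched hardware error status bit. */
struct fake_ctl {
	bool error_latched;	/* set by "hardware", cleared when read */
	bool polling_active;	/* true once the poll callback is registered */
	int reported;		/* number of errors delivered to the handler */
};

/* Read and clear the latch, in the spirit of a get_error_info() helper. */
static bool fake_get_error_info(struct fake_ctl *c)
{
	bool pending = c->error_latched;

	c->error_latched = false;
	return pending;
}

/* One polling period: report whatever is latched right now. */
static void fake_poll_once(struct fake_ctl *c)
{
	if (c->polling_active && fake_get_error_info(c))
		c->reported++;
}

/* Probe: discard the initial state first, then start polling. */
static void fake_probe(struct fake_ctl *c, bool error_in_window)
{
	(void)fake_get_error_info(c);	/* clear pending/initial bits */

	if (error_in_window)
		c->error_latched = true;	/* error lands before registration */

	c->polling_active = true;	/* stands in for edac_mc_add_mc() */
}
```

Calling fake_probe() with error_in_window set and then fake_poll_once() once
reports the error that arrived between the clear and the registration; a
second fake_poll_once() reports nothing new.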

>
> /* Here we assume that we will never see multiple instances of this
>  * type of memory controller. The ID is therefore hardcoded to 0.
>  */
> if (edac_mc_add_mc(mci)) {
> 	edac_dbg(3, "failed edac_mc_add_mc()\n");
> 	goto fail1;
> }
>
> /* allocating generic PCI control info */
> e7xxx_pci = edac_pci_create_generic_ctl(&pdev->dev, EDAC_MOD_STR);
>
> Though it might be hard to find such an old system to test.
>
> -Tony