[PATCH 4.14 124/193] mtd: nand: mtk: fix infinite ECC decode IRQ issue

From: Greg Kroah-Hartman
Date: Tue Nov 28 2017 - 06:14:50 EST


4.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Xiaolei Li <xiaolei.li@xxxxxxxxxxxx>

commit 1d2fcdcf33339c7c8016243de0f7f31cf6845e8d upstream.

For MT2701 NAND Controller, there may generate infinite ECC decode IRQ
during long time burn test on some platforms. Once this issue occurred,
the ECC decode IRQ status cannot be cleared in the IRQ handler function,
and threads cannot be scheduled.

ECC HW generates decode IRQ each sector, so there will have more than one
decode IRQ if read one page of large page NAND.

Currently, ECC IRQ handle flow is that we will check whether it is decode
IRQ at first by reading the register ECC_DECIRQ_STA. This is a read-clear
type register. If this IRQ is decode IRQ, then the ECC IRQ signal will be
cleared at the same time.
Secondly, we will check whether all sectors are decoded by reading the
register ECC_DECDONE. This is because the current IRQ may be not dealed
in time, and the next sectors have been decoded before reading the
register ECC_DECIRQ_STA. Then, the next sectors's decode IRQs will not
be generated.
Thirdly, if all sectors are decoded by comparing with ecc->sectors, then we
will complete ecc->done, set ecc->sectors as 0, and disable ECC IRQ by
programming the register ECC_IRQ_REG(op) as 0. Otherwise, wait for the
next ECC IRQ.

But, there is a timing issue between step one and two. When we read the
reigster ECC_DECIRQ_STA, all sectors are decoded except the last sector,
and the ECC IRQ signal is cleared. But the last sector is decoded before
reading ECC_DECDONE, so the ECC IRQ signal is enabled again by ECC HW, and
it means we will receive one extra ECC IRQ later. In step three, we will
find that all sectors were decoded, then disable ECC IRQ and return.
When deal with the extra ECC IRQ, the ECC IRQ status cannot be cleared
anymore. That is because the register ECC_DECIRQ_STA can only be cleared
when the register ECC_IRQ_REG(op) is enabled. But actually we have
disabled ECC IRQ in the previous ECC IRQ handle. So, there will
keep receiving ECC decode IRQ.

Now, we read the register ECC_DECIRQ_STA once again before completing the
ecc done event. This ensures that there will be no extra ECC decode IRQ.

Also, remove writel(0, ecc->regs + ECC_IRQ_REG(op)) from irq handler,
because ECC IRQ is disabled in mtk_ecc_disable(). And clear ECC_DECIRQ_STA
in mtk_ecc_disable() in case there is a timeout to wait decode IRQ.

Fixes: 1d6b1e464950 ("mtd: mediatek: driver for MTK Smart Device")
Signed-off-by: Xiaolei Li <xiaolei.li@xxxxxxxxxxxx>
Signed-off-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
drivers/mtd/nand/mtk_ecc.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

--- a/drivers/mtd/nand/mtk_ecc.c
+++ b/drivers/mtd/nand/mtk_ecc.c
@@ -115,6 +115,11 @@ static irqreturn_t mtk_ecc_irq(int irq,
op = ECC_DECODE;
dec = readw(ecc->regs + ECC_DECDONE);
if (dec & ecc->sectors) {
+ /*
+ * Clear decode IRQ status once again to ensure that
+ * there will be no extra IRQ.
+ */
+ readw(ecc->regs + ECC_DECIRQ_STA);
ecc->sectors = 0;
complete(&ecc->done);
} else {
@@ -130,8 +135,6 @@ static irqreturn_t mtk_ecc_irq(int irq,
}
}

- writel(0, ecc->regs + ECC_IRQ_REG(op));
-
return IRQ_HANDLED;
}

@@ -307,6 +310,12 @@ void mtk_ecc_disable(struct mtk_ecc *ecc

/* disable it */
mtk_ecc_wait_idle(ecc, op);
+ if (op == ECC_DECODE)
+ /*
+ * Clear decode IRQ status in case there is a timeout to wait
+ * decode IRQ.
+ */
+ readw(ecc->regs + ECC_DECIRQ_STA);
writew(0, ecc->regs + ECC_IRQ_REG(op));
writew(ECC_OP_DISABLE, ecc->regs + ECC_CTL_REG(op));