Re: [PATCH v4] serial: sc16is7xx: address RX timeout interrupt errata

From: Hugo Villeneuve
Date: Wed Nov 22 2023 - 14:38:00 EST


On Wed, 22 Nov 2023 08:35:41 +0100
Daniel Mack <daniel@xxxxxxxxxx> wrote:

> This device has a silicon bug that makes it report a timeout interrupt
> but no data in the FIFO.
>
> The datasheet states the following in the errata section 18.1.4:
>
> "If the host reads the receive FIFO at the same time as a
> time-out interrupt condition happens, the host might read 0xCC
> (time-out) in the Interrupt Indication Register (IIR), but bit 0
> of the Line Status Register (LSR) is not set (means there is no
> data in the receive FIFO)."
>
> The errata doesn't explicitly mention that, but tests have shown
> and the vendor has confirmed that the RXLVL register is equally
> affected.

Hi Daniel,
thank you for the feedback from NXP.

I would suggest replacing this paragraph with something like this:

------
The errata description seems to indicate it affects only polled mode of
operation when reading bit 0 of the LSR register. But when using
interrupt mode (IRQ) like this driver does, reading RXLVL gives a value
of zero even if there is data in the Rx FIFO (confirmed by tests and
NXP).
------

> This bug has hit us on production units and when it does, sc16is7xx_irq()
> would spin forever because sc16is7xx_port_irq() keeps seeing an
> interrupt in the IIR register that is not cleared because the driver
> does not call into sc16is7xx_handle_rx() unless the RXLVL register
> reports at least one byte in the FIFO.
>
> Fix this by always reading one byte when this condition is detected

Change "reading one byte" to "reading one byte from the Rx FIFO".


> in order to clear the interrupt. This approach was confirmed to be
> correct by NXP through their support channels.
>
> Signed-off-by: Daniel Mack <daniel@xxxxxxxxxx>
> Co-Developed-by: Maxim Popov <maxim.snafu@xxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
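
As an aside for anyone following the thread, the spin described in the
commit log can be modelled in user space. This is only a minimal sketch
under stated assumptions, not driver code: the fake_* names, the struct
and the IIR values below are hypothetical stand-ins, and the model simply
assumes that RXLVL reads zero while a time-out interrupt stays pending
until the Rx FIFO is read.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical IIR values for this toy model (not the driver's macros). */
enum { FAKE_IIR_NONE = 0x01, FAKE_IIR_RTOI = 0x0c };

/* Toy chip state: a latched time-out interrupt and the reported RXLVL. */
struct fake_chip {
	bool rtoi_pending;       /* time-out interrupt still latched */
	unsigned int rxlvl;      /* what RXLVL reports (0 under the erratum) */
	unsigned int fifo_reads; /* bytes read from the Rx FIFO so far */
};

static unsigned int fake_read_iir(const struct fake_chip *c)
{
	return c->rtoi_pending ? FAKE_IIR_RTOI : FAKE_IIR_NONE;
}

/* Assumption of the model: reading the Rx FIFO clears the time-out. */
static void fake_read_fifo(struct fake_chip *c, unsigned int len)
{
	c->fifo_reads += len;
	c->rtoi_pending = false;
}

/* One pass of a simplified port handler; true if an interrupt was pending. */
static bool fake_port_irq(struct fake_chip *c, bool workaround)
{
	unsigned int iir = fake_read_iir(c);
	unsigned int rxlen;

	if (iir == FAKE_IIR_NONE)
		return false;

	rxlen = c->rxlvl;
	/* Same idea as the patch: force a one-byte read on an empty time-out. */
	if (workaround && iir == FAKE_IIR_RTOI && !rxlen)
		rxlen = 1;
	if (rxlen)
		fake_read_fifo(c, rxlen);
	return true;
}

/* Count passes that saw a pending interrupt, capped at max_iters. */
static unsigned int fake_irq_loop(bool workaround, unsigned int max_iters)
{
	struct fake_chip c = { .rtoi_pending = true, .rxlvl = 0, .fifo_reads = 0 };
	unsigned int n = 0;

	assert(max_iters > 0);
	while (n < max_iters && fake_port_irq(&c, workaround))
		n++;
	return n;
}
```

Calling fake_irq_loop(false, 1000) runs into the iteration cap, which
stands in for the endless spin, while fake_irq_loop(true, 1000) stops
after a single pass because the forced read clears the interrupt.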

I tested your patch for the last few days and was not able to
reproduce the problem (I added a trace to detect the condition). At
the same time, it has not caused any regressions.

With the above changes, feel free to add:

Tested-by: Hugo Villeneuve <hvilleneuve@xxxxxxxxxxxx>

Hugo.


> ---
> Meanwhile, NXP has confirmed this fix to be correct.
>
> v4: NXP has confirmed the fix; update the commit log accordingly
> v3: re-added the additional Co-Developed-by and stable@ tags
> v2: reworded the commit log a bit for more context.
>
> drivers/tty/serial/sc16is7xx.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
> index 289ca7d4e566..76f76e510ed1 100644
> --- a/drivers/tty/serial/sc16is7xx.c
> +++ b/drivers/tty/serial/sc16is7xx.c
> @@ -765,6 +765,18 @@ static bool sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno)
>  	case SC16IS7XX_IIR_RTOI_SRC:
>  	case SC16IS7XX_IIR_XOFFI_SRC:
>  		rxlen = sc16is7xx_port_read(port, SC16IS7XX_RXLVL_REG);
> +
> +		/*
> +		 * There is a silicon bug that makes the chip report a
> +		 * time-out interrupt but no data in the FIFO. This is
> +		 * described in errata section 18.1.4.
> +		 *
> +		 * When this happens, read one byte from the FIFO to
> +		 * clear the interrupt.
> +		 */
> +		if (iir == SC16IS7XX_IIR_RTOI_SRC && !rxlen)
> +			rxlen = 1;
> +
>  		if (rxlen)
>  			sc16is7xx_handle_rx(port, rxlen, iir);
>  		break;
> --
> 2.41.0
>
>