Re: [RFC PATCH] tty/serial: atmel: fix TX path in atmel_console_write()

From: Nicolas Ferre
Date: Fri Mar 17 2017 - 13:26:12 EST


Le 17/03/2017 Ã 16:11, Richard Genoud a Ãcrit :
> 2017-03-15 17:56 GMT+01:00 Nicolas Ferre <nicolas.ferre@xxxxxxxxxxxxx>:
>> Le 15/03/2017 Ã 17:19, Richard Genoud a Ãcrit :
>>> On 15/03/2017 16:29, Nicolas Ferre wrote:
>>>> A side effect of 89d8232411a8 ("tty/serial: atmel_serial: BUG: stop DMA
>>>> from transmitting in stop_tx") is that the console can be called with
>>>> TX path disabled. Then the system would hang trying to push charecters
>>>> out in atmel_console_putchar().
>>>>
>>>> Signed-off-by: Nicolas Ferre <nicolas.ferre@xxxxxxxxxxxxx>
>>>> Fixes: 89d8232411a8 ("tty/serial: atmel_serial: BUG: stop DMA from transmitting
>>>> in stop_tx")
>>>> Cc: stable <stable@xxxxxxxxxxxxxxx> # 4.4+
>>>> ---
>>>> Hi Richard,
>>>>
>>>> I found this to fix the problem with system hang in my linux-4.4-at91 branch
>>>> (in the atmel_console_putchar() waiting loop actually). I'm open to more
>>>> insignt.
>>>> As we cannot figure out if this bit is set or not, I didn't preserve the
>>>> current status...
>>>>
>>>> Regards,
>>>
>>> So, I'm guessing that you may see some lines/characters printed twice on
>>> the screen, don't you ?
>>
>> Well, actually, I don't think so because the repetitions that I see are
>> probably due to open/close/open/close/re-open/... of the serial console
>> itself.
>>
>> Same with the line "random: udevd: uninitialized urandom read (16 bytes
>> read, 21 bits of entropy available)", they happen at different moment in
>> time => the printk log timestamping seem to indicate that they are
>> different.
> Hi Nicolas,
>
> It seems that the problem is between atmel_tx_dma() and its callback
> atmel_complete_tx_dma().
>
> At some point, atmel_tx_dma() is called, does the job, and then, just
> before the callback is called, the xmit->head and xmit->tail pointers
> are set to zero (by uart_flush_buffer())
> So, when atmel_complete_tx_dma() is called, it does:
> xmit->tail += atmel_port->tx_len;
> not knowing that the head and tail pointers have been reseted.
> => it's like there's (UART_XMIT_SIZE - atmel_port->tx_len) characters to
> transmit on the serial line.
>
> PS: I can trigger this bug by holding down the d key at login and then
> ctrl - basically, a ctrl-d just after sending text - with a rate success
> of about 1/5 :)
>
> Could you try this patch to see if it corrects also your system hang ?

Just tried, it doesn't fix the system hang.

But it seems to solve the issue that I had while halting the system: a
kind of flush of a previous buffer in the console.
So, I think it does solve something.

So, with both my patch and yours:
Tested-by: Nicolas Ferre <nicolas.ferre@xxxxxxxxxxxxx>

Regards,

> (The patch is small, but the bug hunt was a headache :))
>
> [PATCH] tty/serial: atmel: fix race condition (TX+DMA)
>
> If uart_flush_buffer() is called between atmel_tx_dma() and
> atmel_complete_tx_dma(), the circular buffer has been cleared, but not
> atmel_port->tx_len.
> That leads to a circular buffer overflow (dumping (UART_XMIT_SIZE -
> atmel_port->tx_len) bytes).
>
> Signed-off-by: Richard Genoud <richard.genoud@xxxxxxxxx>
> ---
> drivers/tty/serial/atmel_serial.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/tty/serial/atmel_serial.c
> b/drivers/tty/serial/atmel_serial.c
> index 833d3d80446f..89552157e334 100644
> --- a/drivers/tty/serial/atmel_serial.c
> +++ b/drivers/tty/serial/atmel_serial.c
> @@ -1934,6 +1934,11 @@ static void atmel_flush_buffer(struct uart_port
> *port)
> atmel_uart_writel(port, ATMEL_PDC_TCR, 0);
> atmel_port->pdc_tx.ofs = 0;
> }
> + /*
> + * in uart_flush_buffer(), the xmit circular buffer has just
> + * been cleared, so we have to reset tx_len accordingly.
> + */
> + atmel_port->tx_len = 0;
> }
>
> /*
>


--
Nicolas Ferre