Re: [PATCH] uart: mediatek: fix memory corruption issue

From: Greg KH
Date: Thu Jul 22 2021 - 06:12:42 EST


On Thu, Jul 22, 2021 at 06:03:32PM +0800, zhiyong tao wrote:
> On Wed, 2021-07-21 at 12:46 +0200, Greg KH wrote:
> > On Sat, Jul 10, 2021 at 05:01:03PM +0800, Zhiyong Tao wrote:
> > > This patch is used to fix memory corruption issue when rx power off.
> > > 1. add spin lock in mtk8250_dma_rx_complete function in APDMA mode.
> >
> > What does a lock protect from? Please be explicit and detailed.
>
> ==> Hi Gregkh,
>
> when uart is used as a communication port with external device(GPS).
> when external device(GPS) power off, the power of rx pin is also from
> 1.8v to 0v. Even if there is not any data in rx. But uart rx pin can
> capture the data "0".

That sounds like a broken hardware design.

> If uart don't receive any data in specified cycle, uart will generates
> BI(Break interrupt) interrupt.
> If external device(GPS) power off, we found that BI interrupt appeared
> continuously and very frequently.
> When uart interrupt type is BI, uart IRQ handler(8250 framwork
> API:serial8250_handle_irq) will push data to tty buffer.
> The code path:
> https://elixir.bootlin.com/linux/latest/source/drivers/tty/serial/8250/8250_port.c#L1917
>
> mtk8250_dma_rx_complete is a task of mtk_uart_apdma_rx_handler.
> mtk8250_dma_rx_complete priority is lower than uart irq
> handler(serial8250_handle_irq).
> if we are in process of mtk8250_dma_rx_complete, uart appear BI
> interrupt:1)serial8250_handle_irq will priority execution.2)it may cause
> write tty buffer conflict in mtk8250_dma_rx_complete.
> So the spin lock protect the rx receive data process is not break.

Then put something like this in the changelog text, as it is, it is not
descriptive at all.

> > > 2. add processing mechanism which count value is 0
> >
> > What does this do? And why is it needed?
>
> ==> when count value is 0, we don't need push data to tty buffer.
> so we add it.

But that does not actually do anything different from what it does
today. And it has nothing to do with the lock, so this should be 2
different patches, right?

> > > Signed-off-by: Zhiyong Tao <zhiyong.tao@xxxxxxxxxxxx>
> >
> > What commit does this fix? Does this need to go to stable kernel trees?
> > If so, how far back?
> >
> > > ---
> > > drivers/tty/serial/8250/8250_mtk.c | 15 +++++++++++----
> > > 1 file changed, 11 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/tty/serial/8250/8250_mtk.c b/drivers/tty/serial/8250/8250_mtk.c
> > > index f7d3023f860f..09f7d2166315 100644
> > > --- a/drivers/tty/serial/8250/8250_mtk.c
> > > +++ b/drivers/tty/serial/8250/8250_mtk.c
> > > @@ -91,12 +91,15 @@ static void mtk8250_dma_rx_complete(void *param)
> > > struct mtk8250_data *data = up->port.private_data;
> > > struct tty_port *tty_port = &up->port.state->port;
> > > struct dma_tx_state state;
> > > - int copied, total, cnt;
> > > + unsigned int copied, total, cnt;
> > > unsigned char *ptr;
> > > + unsigned long flags;
> > >
> > > if (data->rx_status == DMA_RX_SHUTDOWN)
> > > return;
> > >
> > > + spin_lock_irqsave(&up->port.lock, flags);
> > > +
> > > dmaengine_tx_status(dma->rxchan, dma->rx_cookie, &state);
> > > total = dma->rx_size - state.residue;
> > > cnt = total;
> > > @@ -104,9 +107,11 @@ static void mtk8250_dma_rx_complete(void *param)
> > > if ((data->rx_pos + cnt) > dma->rx_size)
> > > cnt = dma->rx_size - data->rx_pos;
> > >
> > > - ptr = (unsigned char *)(data->rx_pos + dma->rx_buf);
> > > - copied = tty_insert_flip_string(tty_port, ptr, cnt);
> > > - data->rx_pos += cnt;
> > > + if (cnt != 0) {
> >
> > Why does cnt matter here? If cnt is 0, the code above should not do
> > anything at all, right?
>
> ==> yes, if the counter value is 0, we don't need push data to the tty
> buffer.

But this does not change the logic as if cnt is 0, nothing gets pushed
with the current code either, right?

> > Or if it does, should we change tty_insert_flip_string() to always check
> > for cnt != 0 before it does the first loop? Hm, it looks like it will
> > abort if cnt is 0, so what is this change really doing? Why do you need
> > it? What is it "fixing"?
> >
> ==> It is not fix anything, we just think if count value is 0, we don't
> need do anything.

Then make it a separate patch, independant from the lock patch, and we
can discuss it there. Do NOT have patches do multiple things.

thanks,

greg k-h