Re: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open

From: Li,Rongqing
Date: Wed Jan 30 2019 - 21:15:44 EST




> -----Original Message-----
> From: Greg KH [mailto:gregkh@xxxxxxxxxxxxxxxxxxx]
> Sent: January 30, 2019 21:17
> To: Li,Rongqing <lirongqing@xxxxxxxxx>
> Cc: jslaby@xxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; gkohli@xxxxxxxxxxxxxx;
> linux-serial@xxxxxxxxxxxxxxx
> Subject: Re: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open
>
> On Wed, Jan 30, 2019 at 12:48:42PM +0000, Li,Rongqing wrote:
> >
> >
> > > > -----Original Message-----
> > > > From: linux-kernel-owner@xxxxxxxxxxxxxxx
> > > > [mailto:linux-kernel-owner@xxxxxxxxxxxxxxx] On Behalf Of Greg KH
> > > > Sent: January 30, 2019 18:19
> > > > To: Li,Rongqing <lirongqing@xxxxxxxxx>
> > > > Cc: jslaby@xxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > > > gkohli@xxxxxxxxxxxxxx
> > > > Subject: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and
> > > > tty_open
> > >
> > > On Fri, Jan 18, 2019 at 05:27:17PM +0800, Li RongQing wrote:
> > > > > There is still a race window after commit b027e2298bd588
> > > > > ("tty: fix data race between tty_init_dev and flush of buf"): we
> > > > > hit a crash when a receive_buf call arrives before tty
> > > > > initialization completes in n_tty_open, while
> > > > > tty->driver_data may still be NULL.
> > > >
> > > > > CPU0                          CPU1
> > > > > ----                          ----
> > > > > n_tty_open
> > > > > tty_init_dev
> > > > > tty_ldisc_unlock
> > > > >                               schedule flush_to_ldisc
> > > > >                               receive_buf
> > > > >                               tty_port_default_receive_buf
> > > > >                               tty_ldisc_receive_buf
> > > > >                               n_tty_receive_buf_common
> > > > >                               __receive_buf
> > > > >                               uart_flush_chars
> > > > >                               uart_start
> > > > >                               /*tty->driver_data is NULL*/
> > > > > tty->ops->open
> > > > > /*init tty->driver_data*/
> > > >
> > > > > This can be fixed by extending the ldisc semaphore lock in
> > > > > tty_init_dev until driver_data is fully initialized after
> > > > > tty->ops->open(), but that would take the lock in one function
> > > > > and release it in another, which is hard to maintain. So fix
> > > > > this race only by checking tty->driver_data when receiving, and
> > > > > return if tty->driver_data is NULL.
> > > >
> > > > Signed-off-by: Wang Li <wangli39@xxxxxxxxx>
> > > > Signed-off-by: Zhang Yu <zhangyu31@xxxxxxxxx>
> > > > Signed-off-by: Li RongQing <lirongqing@xxxxxxxxx>
> > > > ---
> > > > V4: add version information
> > > > V3: not used ldisc semaphore lock, only checking tty->driver_data
> > > > with NULL
> > > > V2: fix building error by EXPORT_SYMBOL tty_ldisc_unlock
> > > > V1: extend ldisc lock to protect that tty->driver_data is inited
> > > >
> > > > drivers/tty/tty_port.c | 3 +++
> > > > 1 file changed, 3 insertions(+)
> > > >
> > > > > diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c
> > > > > index 044c3cbdcfa4..86d0bec38322 100644
> > > > > --- a/drivers/tty/tty_port.c
> > > > > +++ b/drivers/tty/tty_port.c
> > > > > @@ -31,6 +31,9 @@ static int tty_port_default_receive_buf(struct tty_port *port,
> > > > >  	if (!tty)
> > > > >  		return 0;
> > > > >
> > > > > +	if (!tty->driver_data)
> > > > > +		return 0;
> > > > > +
> > >
> > > > How is this working? What is setting driver_data to NULL to "stop"
> > > > this race?
> > >
> >
> >
> > > If tty->driver_data is NULL, tty_port_default_receive_buf returns early
> > > and never reaches uart_start, which accesses tty->driver_data and
> > > triggers the panic before tty_open completes. That is how it fixes the
> > > system panic.
> >
> > > > There's no requirement that a tty driver set this field to NULL when
> > > > it is "done" with the tty device, so I think you are just getting
> > > > lucky in that your specific driver happens to be doing this.
> > >
> >
> > > While tty_open is running, the tty is allocated with kzalloc in
> > > tty_init_dev, which is called by tty_open_by_driver, so the tty is
> > > initialized to 0 and tty->driver_data starts out as NULL.
> >
> > > What driver are you testing this against?
> > >
> >
> > 8250
>
> Ok, as this is specific to the uart core, how about this patch instead:
>
> diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
> index 5c01bb6d1c24..b56a6250df3f 100644
> --- a/drivers/tty/serial/serial_core.c
> +++ b/drivers/tty/serial/serial_core.c
> @@ -130,6 +130,9 @@ static void uart_start(struct tty_struct *tty)
>  	struct uart_port *port;
>  	unsigned long flags;
>
> +	if (!state)
> +		return;
> +
>  	port = uart_port_lock(state, flags);
>  	__uart_start(tty);
>  	uart_port_unlock(port, flags);


If we move the check into uart_start, I am afraid it may not fully fix this
issue, since n_tty_receive_buf_common may call n_tty_check_throttle/
tty_unthrottle_safe, which may also use tty->driver_data.

If the tty is not fully opened, I see no gain in stepping into more functions.
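
To illustrate the concern, here is a minimal userspace sketch, not kernel
code. All model_* names are hypothetical stand-ins: model_start plays the role
of uart_start with the proposed NULL check, and model_unthrottle plays the role
of any other callee on the receive path that may use tty->driver_data. The
model only counts an unguarded entry where the kernel could crash.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct model_state { int tx_started; int unthrottled; };
struct model_tty { void *driver_data; };

/* Number of callees entered without a guard while driver_data was NULL. */
static int unsafe_entries;

/* Stand-in for uart_start with the NULL check placed inside it. */
static void model_start(struct model_tty *tty)
{
	struct model_state *state = tty->driver_data;

	if (!state)
		return;
	state->tx_started = 1;
}

/* Stand-in for another callee that uses driver_data and has no check. */
static void model_unthrottle(struct model_tty *tty)
{
	struct model_state *state = tty->driver_data;

	if (!state) {
		/* The kernel could dereference NULL here; the model only counts. */
		unsafe_entries++;
		return;
	}
	state->unthrottled = 1;
}

/* Stand-in for the receive path; guard_at_entry selects where the check is. */
static int model_receive(struct model_tty *tty, int guard_at_entry)
{
	if (guard_at_entry && !tty->driver_data)
		return 0;
	model_unthrottle(tty);
	model_start(tty);
	return 1;
}

/* Runs one receive on a tty that is not opened yet; returns unsafe entries. */
static int unsafe_entries_before_open(int guard_at_entry)
{
	struct model_tty tty = { .driver_data = NULL };

	unsafe_entries = 0;
	model_receive(&tty, guard_at_entry);
	return unsafe_entries;
}

/* Usage example: compares the two placements of the check. */
static void model_selfcheck(void)
{
	assert(unsafe_entries_before_open(0) == 1); /* check only in model_start */
	assert(unsafe_entries_before_open(1) == 0); /* check at receive entry */
}
```

With the check only in the start function, the other callee is still entered
with a NULL driver_data; with the check at the entry of the receive path,
nothing below it runs until the tty is opened.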

thanks

-RongQing