Re: Q: tty_open() && hangup

From: Peter Hurley
Date: Fri Apr 15 2016 - 15:29:21 EST


Hi Oleg,

On 04/15/2016 10:19 AM, Oleg Nesterov wrote:
> Hi,
>
> I am fingting with obscure pty bug in rhel6 which _might be_ explained
> by some hangup/reopen races, and this motivated me to look at upstream
> code which I can't understand too ;)
>
> Lets look at tty_open(),
>
> 2112 tty = tty_open_current_tty(device, filp);
> 2113 if (!tty)
> 2114 tty = tty_open_by_driver(device, inode, filp);
> 2115
> 2116 if (IS_ERR(tty)) {
> 2117 tty_free_file(filp);
> 2118 retval = PTR_ERR(tty);
> 2119 if (retval != -EAGAIN || signal_pending(current))
> 2120 return retval;
> 2121 schedule();
> 2122 goto retry_open;
>
> And why do we need this schedule() above? It is nop in TASK_RUNNING.
>
> If we need it for non-preempt kernels to avoid some sort of livelock
> this can't work if rt_task(current) == T.
>
> If we just want to add a preemption point then perhaps we should use
> cond_resched/might_resched ?

Yeah, it's just a preemption point that pre-dates cond_resched.


> I am not saying this is wrong, but looks confusing.
>
> 2130 if (tty->ops->open)
> 2131 retval = tty->ops->open(tty, filp);
> 2132 else
> 2133 retval = -ENODEV;
> 2134 filp->f_flags = saved_flags;
> 2135
> 2136 if (retval) {
> 2137 tty_debug_hangup(tty, "open error %d, releasing\n", retval);
> 2138
> 2139 tty_unlock(tty); /* need to call tty_release without BTM */
> 2140 tty_release(inode, filp);
> 2141 if (retval != -ERESTARTSYS)
> 2142 return retval;
> 2143
> 2144 if (signal_pending(current))
> 2145 return retval;
>
> So we can only get here if tty->ops->open returns -ERESTARTSYS without
> signal_pending(). This doesn't look good. But OK, lets forget this.

tty_port_block_til_ready() returns -ERESTARTSYS if !ASYNC_HUP_NOTIFY and
tty was hungup or the h/w was reset (eg. while waiting for tty lock).
IOW, parallel hangup() or release() happened at the point where the
tty lock was not held.


> 2147 schedule();
>
> Again, this looks confusing.
>
> 2148 /*
> 2149 * Need to reset f_op in case a hangup happened.
> 2150 */
> 2151 if (tty_hung_up_p(filp))
> 2152 filp->f_op = &tty_fops;
>
> And this is called after tty_add_file() which makes this filp visible to
> __tty_hangup(), and without tty_lock().

tty_release() has removed the filp from the files list already (with the
tty lock held), so if the file operations were reset it's because a
hangup happened when the tty lock was not held (eg. waiting for reopen
or after dropping the lock and reacquiring it in tty_release())


> How can we trust this check? __tty_hangup() can set hung_up_tty_fops()
> right after tty_hung_up_p() returns F.

__tty_hangup() will not have had visibility to this filp after
tty_release().


> Don't we need something like below? Otherwise afaics tty_open() can
> actually succeed (and clear TTY_HUPPED) after restart, but return with
> tty_hung_up_p(filp) == T.

No.

Regards,
Peter Hurley



> --- x/drivers/tty/tty_io.c
> +++ x/drivers/tty/tty_io.c
> @@ -2122,6 +2122,13 @@ retry_open:
> goto retry_open;
> }
>
> + /*
> + * This is only possible after retry_open, we drop tty_lock()
> + * without tty_del_file().
> + */
> + if (tty_hung_up_p(filp))
> + filp->f_op = &tty_fops;
> +
> tty_add_file(tty, filp);
>
> check_tty_count(tty, __func__);
> @@ -2145,11 +2152,6 @@ retry_open:
> return retval;
>
> schedule();
> - /*
> - * Need to reset f_op in case a hangup happened.
> - */
> - if (tty_hung_up_p(filp))
> - filp->f_op = &tty_fops;
> goto retry_open;
> }
> clear_bit(TTY_HUPPED, &tty->flags);
>