[2.4 TTY] Lost wake-up of tty->write_wait due to race?

From: Davda, Bhavesh P (Bhavesh)
Date: Tue Feb 15 2005 - 17:42:01 EST


Sorry for the repost! I forgot to attach my analysis so far ...

In the 2.4 kernels, is there a race window when a wake_up() of the
tty->write_wait queue from the underlying tty_driver can be lost in
la-la-land, while a task is sleeping on it from tty_wait_until_sent() ?

I am seeing something similar. I have the "pppd" daemon in the
TASK_INTERRUPTIBLE state stuck in tty_wait_until_sent() [determined from
wchan] as a result of its ioctl(fd, TIOCSETD, [N_TTY]) call while about
to exit.

The debug module is showing that the tty->write_wait queue is empty. As
a result, *nothing* will wake up pppd from its TASK_INTERRUPTIBLE state.

How could that be? Just to show that I've done my homework, attached is
my analysis of the issue.

I've seen references to race conditions in the changing of
line-disciplines in postings by Alan Cox, but none of those references
seems to explain what I am seeing.

Any insight will be greatly appreciated!

Thanks

- Bhavesh

Bhavesh P. Davda | Distinguished Member of Technical Staff | Avaya |
1300 West 120th Avenue | B3-B03 | Westminster, CO 80234 | U.S.A. |
Voice/Fax: 303.538.4438 | bhavesh@xxxxxxxxx
pppd:
Calls ioctl(modemfd [/dev/usb/ttyACM0], TIOCSETD, (N_TTY))
In the kernel:
tty_ioctl() calls tty_wait_until_sent(tty, 0)
[ drivers/char/tty_io.c:1735 ]
tty_wait_until_sent(tty, 0)
checks tty->driver.chars_in_buffer func ptr. If true,
Adds pppd to the tty->write_wait wait queue
Calls tty->driver.chars_in_buffer(tty). If true,
Translates to acm_tty_chars_in_buffer()
Returns 0 if acm->writeurb.status != EINPROGRESS
Returns -EINVAL if !ACM_READY()
Returns acm->writeurb.transfer_buffer_length if
acm->writeurb.status==EINPROGRESS
Calls schedule_timeout(MAX_SCHEDULE_TIMEOUT)

To wake up pppd from schedule_timeout(MAX_SCHEDULE_TIMEOUT), someone has
to wake_up the tty->write_wait wait queue
One candidate to do so is acm_softint()
acm_softint() is called in bottom-half of acm_write_bulk()
acm_write_bulk() is called as a completion routine for the write URB
Most likely happens in interrupt context, as the outstanding
UHCI TD is completed. That's why it queues a IMMEDIATE_BH task
to do acm_softint()

POTENTIALS FOR RACE CONDITION:
Another task does a write() on /dev/usb/ttyACM0
Translates to acm_tty_write()
acm->writeurb.tranfer_buffer_length = count passed in
usb_submit_urb(&acm->writeurb)
UHCI driver writes TD to USB controller
If pppd does its ioctl(TIOCSETD) *after* usb_submit_urb() from other task,
then its tty_wait_until_sent(tty, 0) routine will get a non-zero
acm_tty_chars_in_buffer(). By that time, pppd has added itself to the
tty->write_wait queue. So on completion of the TD/URB, acm_softint()
*should* wake up pppd correctly

What happens if the TD completion interrupt (and hence the call to
acm_write_bulk()) comes in right before pppd calls
schedule_timeout(MAX_SCHEULE_TIMEOUT) ?
It queues an IMMEDIATE_BH task to do acm_softint(), which won't be executed
until *after* the schedule_timeout() call from pppd (*I THINK*, due to the
non-preemptible nature of the 2.4.20 kernel). acm_softint() *should* be
executed by the ksoftirqd kernel thread, which runs *after* pppd has called
schedule_timeout(). So the wake-up of tty->write_wait *should* wake up pppd
correctly


What can I do to see if pppd is still on the write_wait queue for the tty?

Write a driver whose init routine will create a tty, and examine its
write_wait queue.