2nd try: Linux 2.0.[0-14] TCP & PI2

Klaus Kudielka (oe1kib@oe1xtu.ampr.org)
Thu, 22 Aug 1996 21:30:28 +0000

Hello everybody,

I can still reproduce freezing TCP connections with Linux 2.0.14. The
problem is most likely related to the PI2 driver.

I have connected two Linux boxes ("oe1xtu" and "venus") with PI2 boards
and WA4DSY modems. "venus" also acts as a router to a few other hosts on
a Token Ring ("merkur" among them). All systems have Linux 2.0.14.

         PI2         Token Ring
oe1xtu ------- venus ------------ merkur

The following TCP connections DO work, regardless of the amount of data
transferred:

merkur -> venus (transferring data from merkur to venus)
merkur -> oe1xtu
venus -> merkur

The following TCP connections reproducibly freeze after transferring
between 100k and 400k:

oe1xtu -> venus
oe1xtu -> merkur
venus -> oe1xtu

"freezing" means the following:
1.) The "write" system call of the transmitting host sleeps forever
(unless notified by SIGINT). I observed this by stracing "rcp". "ftp"
gives the same result.
2.) In that frozen state, "netstat -t" on the transmitting host reports
"Send-Q==0" and "Recv-Q==0" for the TCP connection in question.
3.) The receiving host just waits for new data, but doesn't receive any.
4.) When sending a SIGINT to the transmitting process (for instance by
pressing Ctrl-C), a few more bytes are transmitted, and the connection
is (of course) terminated.
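
Symptom 1 can also be checked without strace by sending filler data and
timing out when a single send blocks for too long. The following is only
a minimal sketch, assuming Python 3 is available on the transmitting
host; send_until_stall() and its parameters are made-up names for
illustration, not part of rcp, ftp or any other tool mentioned here.

```python
import socket

def send_until_stall(sock, total_bytes, chunk_size=4096, stall_timeout=5.0):
    """Send total_bytes of filler data over an already connected socket.

    Returns (bytes_sent, stalled). stalled is True if one send() call
    blocked for longer than stall_timeout seconds, which is how a
    "frozen" write would show up from user space.
    """
    sock.settimeout(stall_timeout)
    payload = b"x" * chunk_size
    sent = 0
    while sent < total_bytes:
        # Never send more than what is still outstanding.
        chunk = payload[: min(chunk_size, total_bytes - sent)]
        try:
            sent += sock.send(chunk)
        except socket.timeout:
            # The kernel accepted no more data within the timeout.
            return sent, True
    return sent, False
```

Called with a socket connected to any service on the receiving host that
reads and discards its input (which service is up to the tester), the
first element of the result shows how many bytes were accepted before
the stall, which can be compared with the 100k to 400k seen above.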

Looking at the set of problematic connections, I'm quite sure that there
must be some fault in the interaction between the TCP stack and the PI2
driver. (If the sending TCP stack transmits via a PI2 interface, the
connection freezes -- if it does NOT transmit via PI2, everything works
ok.)

Looking at the kernel sources, I found one hint: do_tcp_sendmsg()
normally sleeps when it waits for queue memory to be freed (i.e. it
calls wait_for_tcp_memory()). Maybe the PI2 driver does not free every
write buffer? Or, maybe wake_up_interruptible() is not called every time
the PI2 driver frees write memory?
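
The second guess (a missing wake-up) can be illustrated with a toy
user-space model. This is only a sketch of the hypothesis, not kernel
code: SendQueue, reserve(), free() and wake_on_free are invented names,
a threading.Condition stands in for the kernel's wait queue, and the
timeout stands in for "sleeps forever".

```python
import threading

class SendQueue:
    """Toy model of per-socket write-memory accounting (hypothetical)."""

    def __init__(self, limit, wake_on_free=True):
        self.limit = limit                # maximum bytes that may be queued
        self.used = 0                     # bytes currently queued
        self.wake_on_free = wake_on_free  # False models the suspected bug
        self.cond = threading.Condition()

    def reserve(self, size, timeout):
        """Writer side: sleep until size bytes fit into the queue.

        Returns True once the space is reserved, or False if nobody woke
        the writer within timeout seconds (the "frozen write" case).
        """
        with self.cond:
            while self.used + size > self.limit:
                if not self.cond.wait(timeout):
                    return False
            self.used += size
            return True

    def free(self, size):
        """Driver side: release a buffer after it has been transmitted."""
        with self.cond:
            self.used -= size
            if self.wake_on_free:
                self.cond.notify_all()  # wake any sleeping writer
```

With wake_on_free=False the writer stays asleep although "used" has
already dropped to 0, which would at least be consistent with
"Send-Q==0" on a frozen connection. It does not show that pi2.c actually
behaves this way.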

However, I found nothing suspicious in pi2.c. I might be totally wrong
about this idea.

Does anybody else have this problem?
Can anybody give me a hint what to try next?
Or, even better, can anybody suggest a fix?

Unfortunately I'm not very familiar with the implementation of the TCP
stack. And it's not easy to understand it by reading the source.

Thanks in advance,


Klaus Kudielka OE1KIB        Peter Jordanstr. 165, A-1180 Wien, AUSTRIA
oe1kib@oe1xtu.ampr.org                  http://oe1xtu.ampr.org/~oe1kib/