Possible Linux/Solaris 7 TCP bug

Ilpo Ruotsalainen (lonewolf@cs.hut.fi)
Thu, 12 Aug 1999 23:37:08 +0300 (EET DST)


Note: I'm not on the list, so please reply directly or at least Cc me on
replies.

Same setup as in my previous problem, except that 2.2.11 is now patched
with the patch from the errata page at http://www.linux.org.uk. The memory
leak is gone, but now I get strange behaviour between the Linux and the Sun
machines. The machines start flooding the network madly, and killing the
server process doesn't help. The only way to stop it is to pull the network
cable and not plug it back in until the sockets have really closed.

Actually, I think this also happened in the previous tests if I tried to stop
the scripts that run the clients before my Linux machine ran out of memory,
but I'm not 100% sure. A tcpdump trace is at
http://www.cs.hut.fi/~lonewolf/tcpdump.gz

Some info from the Linux machine:

netstat -ta:
Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 gunbuster.cs.hut.f:1022 ema.cs.hut.fi:ssh       FIN_WAIT2
tcp        0      0 gunbuster.cs.hut.f:2252 ema.cs.hut.fi:4001      TIME_WAIT
tcp        0      0 gunbuster.cs.hut.f:2042 ema.cs.hut.fi:4024      TIME_WAIT
tcp        0      0 gunbuster.cs.hut.f:1485 ema.cs.hut.fi:4002      TIME_WAIT
tcp        0      0 gunbuster.cs.hut.f:1318 cyberneti.cs.hut.f:4011 TIME_WAIT
tcp        0      0 gunbuster.cs.hut.f:1286 ema.cs.hut.fi:4013      TIME_WAIT
tcp        0      0 gunbuster.cs.hut.f:1023 ema.cs.hut.fi:ssh       ESTABLISHED
tcp        0      0 *:ssh                   *:*                     LISTEN
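
To summarise the netstat output above, here is a minimal Python sketch that counts sockets per TCP state. The rows are copied from the listing, and the helper name count_states is made up for illustration. It only shows that every connection to the test ports (the 40xx ones) is already in TIME_WAIT while the flood continues; it does not say which side is misbehaving.

```python
from collections import Counter

# Rows copied from the netstat listing in this report (header lines omitted).
NETSTAT = """\
tcp 0 0 gunbuster.cs.hut.f:1022 ema.cs.hut.fi:ssh FIN_WAIT2
tcp 0 0 gunbuster.cs.hut.f:2252 ema.cs.hut.fi:4001 TIME_WAIT
tcp 0 0 gunbuster.cs.hut.f:2042 ema.cs.hut.fi:4024 TIME_WAIT
tcp 0 0 gunbuster.cs.hut.f:1485 ema.cs.hut.fi:4002 TIME_WAIT
tcp 0 0 gunbuster.cs.hut.f:1318 cyberneti.cs.hut.f:4011 TIME_WAIT
tcp 0 0 gunbuster.cs.hut.f:1286 ema.cs.hut.fi:4013 TIME_WAIT
tcp 0 0 gunbuster.cs.hut.f:1023 ema.cs.hut.fi:ssh ESTABLISHED
tcp 0 0 *:ssh *:* LISTEN
"""


def count_states(text):
    """Count TCP sockets per state (hypothetical helper, not a netstat feature)."""
    counts = Counter()
    for line in text.splitlines():
        parts = line.split()
        # Expect: proto, recv-q, send-q, local, foreign, state
        if len(parts) == 6 and parts[0] == "tcp":
            counts[parts[5]] += 1
    return counts


states = count_states(NETSTAT)
for state, n in sorted(states.items()):
    print(f"{state:12s} {n}")
```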

vmstat 1 prints lines like this:
0 0 0 0 231844 7016 6580 0 0 0 0 16458 8 0 1 99
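
The vmstat header line is not included above, so the following Python sketch labels the fields under an assumption: that the output uses the classic 16-column Linux vmstat layout (r b w swpd free buff cache si so bi bo in cs us sy id). If that assumption holds, the large value is the interrupts-per-second column, while the CPU is almost entirely idle.

```python
# ASSUMPTION: classic 16-column Linux vmstat layout. The header is not in
# the report, so this column order is a guess, not something it confirms.
FIELDS = [
    "r", "b", "w",            # procs
    "swpd", "free", "buff", "cache",  # memory
    "si", "so",               # swap
    "bi", "bo",               # io
    "in", "cs",               # system: interrupts and context switches per second
    "us", "sy", "id",         # cpu percentages
]

# Sample line copied from the report.
line = "0 0 0 0 231844 7016 6580 0 0 0 0 16458 8 0 1 99"

values = [int(v) for v in line.split()]
assert len(values) == len(FIELDS), "line does not match the assumed layout"
sample = dict(zip(FIELDS, values))

print("interrupts/s:", sample["in"])
print("context switches/s:", sample["cs"])
print("cpu idle %:", sample["id"])
```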

ifconfig:
RX packets:3880215 errors:0 dropped:0 overruns:0 frame:0
TX packets:4435463 errors:0 dropped:0 overruns:0 carrier:0

ifconfig 5 seconds later:
RX packets:3945157 errors:0 dropped:0 overruns:0 frame:0
TX packets:4500379 errors:0 dropped:0 overruns:0 carrier:0
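
As a rough cross-check of the two ifconfig samples above, this small Python sketch turns the counter differences into packet rates. The counters are copied from the report; the 5-second interval is only the approximate gap between the two runs, so the result is an estimate of roughly 13,000 packets per second in each direction.

```python
# Counter values copied from the two ifconfig samples in this report.
# The interval is an approximation, so the rates are estimates only.
INTERVAL_S = 5

first = {"rx": 3880215, "tx": 4435463}
second = {"rx": 3945157, "tx": 4500379}


def packet_rates(before, after, interval_s):
    """Return packets per second for each counter between two samples."""
    return {name: (after[name] - before[name]) / interval_s for name in before}


rates = packet_rates(first, second, INTERVAL_S)
for name, rate in rates.items():
    print(f"{name.upper()}: {rate:.0f} packets/s")
```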

I could not get any info from the Sun machines, since I don't have console
access to them right now and can't ssh to them with the network flooded
(previously opened ssh connections are very laggy; it sometimes takes almost
30 seconds for anything I type to get through).

This _might_ be a Solaris bug, since I just noticed that cyberneti doesn't
work too well anymore: trying to run anything causes "Exec format error".
One earlier test ended with ema in the same state; it reported something like
"cannot load module elf<something>", if I remember right.

SunOS ema 5.7 Generic_106541-05 sun4u sparc

cyberneti is an identical machine.

This is hindering my work quite badly, so I'd be very grateful if some TCP
guru could take a look at the trace and tell me which OS is at fault
(that is, which one I'll need to wait for a patch for).

Thank you in advance and sorry for spamming the list if this is irrelevant.

--
Ilpo Ruotsalainen - <lonewolf@iki.fi> - http://www.iki.fi/lonewolf/
