Weird connect() lockup

Pekka Savola (Pekka.Savola@netcore.fi)
Sat, 24 Apr 1999 19:53:30 +0300


Hi, I have been experiencing a very strange problem with 2.2.6 (and also
previous kernels). I have posted several times to a few Linux newsgroups,
but got no help there. Perhaps one of you has encountered something like
this before:

Sometimes (my guess is when there are a lot of simultaneous connect()
calls, 30-40 perhaps) connections die very strangely _after_ that: you
can't establish any TCP connection anymore. Only rebooting helps.
ICMP works fine; I think UDP also works (or else I have a really big
DNS cache).

Might this be some kind of anti-DoS protection or something?

I have two NICs, one for the private LAN (eth1; 192.168.1.2) and one
for the net (eth0; 130.233.25.176). For some reason, all connections
from the server _seem_ to use the 192.168.1.2 address. Wouldn't it be
much wiser to use eth0's IP address? Is there a way to change that?
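
On the question of choosing the source address: one approach an
application can take, independent of the routing table, is to bind() the
socket to the desired local address before calling connect(). Below is a
minimal Python sketch of that idea, not a statement of how ftp or the
kernel behaves; the helper name connect_from and the use of port 0 (let
the kernel pick an ephemeral port) are illustrative assumptions.

```python
import socket

def connect_from(source_ip, dest_host, dest_port, timeout=10.0):
    """Open a TCP connection whose local address is forced to source_ip."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        # Port 0 asks the kernel for any free ephemeral port.
        sock.bind((source_ip, 0))
        sock.connect((dest_host, dest_port))
    except OSError:
        # Do not leak the descriptor if bind() or connect() fails.
        sock.close()
        raise
    return sock
```

With the addresses from this post, the call would look like
connect_from("130.233.25.176", "128.214.248.6", 21), and getsockname()
on the returned socket should then report 130.233.25.176.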

Also, I'm using ipchains 1.3.8, and I think this might be (partly) a
masquerading/forwarding problem.

As it happens, every connection from my masqueraded Win98 box works
fine despite the server having difficulties; the only exceptions are
those that use identd (which runs on the server) or similar
applications. But this only slows things down a bit (since the apps
don't even get a response from identd); it doesn't disable connections
entirely.

So, the only thing affected seems to be connections from the server
itself, which _seem_ to use the local LAN address to connect to remote
sites.

If I take the local LAN interface down with ifconfig after the crash,
it still doesn't work.

Does anyone have _any_ idea where to look for the problem or to fix
it? This is getting very frustrating.

Here are some, hopefully relevant, parts of 'strace ftp ftp.funet.fi':
[192.168.1.2 = internal NIC, 130.233.25.176 = external NIC,
128.214.248.6 = ftp.funet.fi]

Notes: the non-working version freezes at connect() to ftp.funet.fi.
Otherwise there doesn't seem to be much difference.
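
To tell a hung connect() apart from an ordinary refusal without waiting
for the full kernel timeout, the call can be given a deadline. The
following Python sketch is one way to do that; the function name
probe_tcp, the returned strings and the 5-second default are
illustrative choices, not something taken from the traces.

```python
import socket

def probe_tcp(host, port, timeout=5.0):
    """Classify a TCP connect attempt: connected, refused, timeout or error."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "connected"
    except socket.timeout:
        # Nothing came back before the deadline: the hang case.
        return "timeout"
    except ConnectionRefusedError:
        # The peer answered with a reset, so packets flow both ways.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. "network is unreachable".
        return "error: %s" % exc
    finally:
        sock.close()
```

A result of "timeout" from probe_tcp("128.214.248.6", 21) while ICMP
still works would match the symptom described here.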

I don't know the current TCP/IP implementation very well, but it seems
odd to me that my actual outbound IP is mentioned only after the
connection is established; it always tries 192.168.1.2 first
(I guess it's cached or something, since even after I took the
interface down, it still uses 192.168.1.2).
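
One way to see which source address the kernel would pick for a given
destination, without sending any traffic, is to connect() a UDP socket
and then call getsockname(). A minimal Python sketch follows, assuming
the usual behaviour that connect() on a datagram socket only records the
peer and selects a local address from the routing table; the helper name
source_address_for is hypothetical.

```python
import socket

def source_address_for(dest_ip, dest_port=53):
    """Return the local IP the kernel selects for traffic to dest_ip."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # For UDP, connect() sends no packets; it fixes the peer and
        # makes the kernel choose a local address for that route.
        sock.connect((dest_ip, dest_port))
        return sock.getsockname()[0]
    finally:
        sock.close()
```

Comparing source_address_for("192.168.1.2") with
source_address_for("128.214.248.6") would show which local address is
chosen for each destination. In the traces below, the address passed to
connect() is the destination, while getsockname() reports the local one.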

non-working connection:
-------
close(3) = 0
munmap(0x4000b000, 10652) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
connect(3, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.2")}, 16) = 0
send(3, "t\310\1\0\0\1\0\0\0\0\0\0\3ftp\5"..., 30, 0) = 30
select(4, [3], NULL, NULL, {5, 0}) = 1 (in [3], left {5, 0})
recvfrom(3, "t\310\201\200\0\1\0\1\0\2\0\2\3f"..., 1024, 0, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.2")}, [16]) = 130
close(3) = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
connect(3, {sin_family=AF_INET, sin_port=htons(21), sin_addr=inet_addr("128.214.248.6")}, 16
------

And the working one:
-------
close(3) = 0
munmap(0x4000b000, 10652) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
connect(3, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.2")}, 16) = 0
send(3, "g\337\1\0\0\1\0\0\0\0\0\0\3ftp\5"..., 30, 0) = 30
select(4, [3], NULL, NULL, {5, 0}) = 1 (in [3], left {5, 0})
recvfrom(3, "g\337\201\200\0\1\0\1\0\2\0\2\3f"..., 1024, 0, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.2")}, [16]) = 130
close(3) = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
connect(3, {sin_family=AF_INET, sin_port=htons(21), sin_addr=inet_addr("128.214.248.6")}, 16) = 0
<----- the non-working version froze here ------->
getsockname(3, {sin_family=AF_INET, sin_port=htons(1028), sin_addr=inet_addr("130.233.25.176")}, [16]) = 0
setsockopt(3, IPPROTO_IP, 1, [16], 4) = 0
------
Pekka Savola pekkas@netcore.fi

---
Across the nations the stories spread like spiderweb laid upon spiderweb, 
and men and women planned the future, believing they knew truth. They 
planned, and the Pattern absorbed their plans, weaving toward the future
foretold.		-- Robert Jordan: The Path of Daggers
