linux-2.1.98 bind returns EAGAIN: bandaid fix

Adam J. Richter (
Sat, 2 May 1998 09:18:35 -0700

An ISP customer of ours recently moved from 2.0.x to 2.1.98
on a shell account machine because they were making it a multiprocessor.
Anyhow, after about a dozen users would log in, they would not be able
to make outgoing TCP connections. Telnet and FTP would fail, and
ptrace revealed that these programs would always get the error EAGAIN
from the bind(2) system call. It turns out that this was caused by
tcp_good_socknum() in linux-2.1.98/net/ipv4/tcp_ipv4.c failing to
find an available socket number, even though only a very small number
of sockets were in use on the system. Somehow, every available
socket number had a tcp_bind_bucket associated with it, but that
bucket would have a null "owners" field. Putting a call
to tcp_bucketgc() at the beginning of tcp_good_socknum() made the
problem go away, but I am sure that is not exactly the right fix,
since the point of having a hashing scheme is to go fast. However,
this band aid fix has now put the machine that was experiencing
this problem back into production, and I have not seen the problem
anywhere else. So, I'm passing this information on in case it
is helpful to anyone tracking down 2.1.x networking bugs.
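
For anyone who wants to check whether a machine is in this state
without waiting for telnet or ftp to fail, here is a minimal user-space
sketch. It is not part of the kernel change described above. It only
asks the kernel to pick a local TCP port by binding to port 0, which is
the kind of request that was coming back with EAGAIN. The helper name
probe_ephemeral_bind() and the BIND_PROBE_MAIN macro are my own
inventions for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the kernel source): bind a fresh TCP
 * socket to port 0 so that the kernel has to choose a free local port.
 * Returns 0 and stores the chosen port in *out_port on success,
 * otherwise returns the errno value of the call that failed.
 */
int probe_ephemeral_bind(unsigned short *out_port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd, err;

    assert(out_port != NULL);
    *out_port = 0;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(0);           /* 0 = let the kernel pick */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        err = errno;                    /* EAGAIN was the symptom here */
        close(fd);
        return err;
    }

    /* Read back which port the kernel actually assigned. */
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        err = errno;
        close(fd);
        return err;
    }

    *out_port = ntohs(addr.sin_port);
    close(fd);
    return 0;
}

#ifdef BIND_PROBE_MAIN
int main(void)
{
    unsigned short port;
    int err = probe_ephemeral_bind(&port);

    if (err != 0) {
        fprintf(stderr, "bind failed: %s\n", strerror(err));
        return 1;
    }
    printf("kernel assigned local port %u\n", (unsigned)port);
    return 0;
}
#endif
```

Compiling with -DBIND_PROBE_MAIN gives a small standalone program. On
a healthy machine it should print an assigned port; a "bind failed"
line reporting EAGAIN while few sockets are in use would match the
symptom described above.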

Adam J. Richter     __     ______________   4880 Stevens Creek Blvd, Suite 205
                      \ /                   San Jose, California 95129-1034
+1 408 261-6630        | g g d r a s i l    United States of America
fax +1 408 261-6631      "Free Software For The Rest Of Us."
