Re: Possible problem with ACCEPT/CONNECT in

From: Stephen Richard Ives (ives@andrew.cmu.edu)
Date: Sat Apr 08 2000 - 17:21:20 EST


On Fri, 7 Apr 2000, Peter Zaitsev wrote:

> Recently I wrote to this list about a problem with a strange slowdown on
> network connections. Now I have done some investigation and must say the
> problem really exists and is repeatable! Both 2.2 and 2.3 kernels are
> affected, in SMP and non-SMP configurations.

I've noticed the same problem between hosts (2.2.14-SMP server, 2.2.12-UP
client) on my 100Mbps LAN while testing performance of various
self-coded webservers for a class.

> The problem is that when many clients are trying to connect to the same
> port, they sometimes get locked for a long time, even when CPU usage on the
> server is low and it is able to do the needed number of accepts in time.
> The problem persists whether the client and server are on the same or on
> different machines, and even when the number of server processes doing
> accept() is greater than the number of clients running.
> The next strange thing is the time taken by connect(): if it does not
> complete immediately, it gets locked for 3, 9, 21, 45 ... seconds.

I didn't actually take any measurements at the time, but xosview showed
the cpu and network idle for periods of time while running the tests, and
then it would suddenly pick back up. It would go idle for up to maybe a
minute.
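
The 3, 9, 21, 45 second pattern quoted above is what you would expect to
see if the client's initial SYN were being dropped and retransmitted with a
timeout that starts at 3 seconds and doubles on every retry. That is only an
assumption about the cause, not something measured in these tests. A minimal
sketch of the arithmetic (the function name and defaults are hypothetical):

```python
def cumulative_syn_delays(initial_rto=3.0, retries=4):
    """Elapsed time at which each retransmission would be sent,
    assuming the timeout doubles after every attempt."""
    delays = []
    elapsed = 0.0
    rto = initial_rto
    for _ in range(retries):
        elapsed += rto          # wait out the current timeout
        delays.append(elapsed)  # a retransmission goes out here
        rto *= 2                # exponential backoff
    return delays

print(cumulative_syn_delays())  # [3.0, 9.0, 21.0, 45.0]
```

If the stalls really do land on these values, that would point at lost or
ignored SYNs rather than at the server's accept() loop.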

> I can't get this problem 100% repeatable. The funny thing is that on the
> same machine under the same load, sometimes I get the problem with the same
> program and sometimes I do not, so the only way I was able to repeat it is
> to play with the parameters a bit (number of listening and connecting
> processes).
> The other interesting effect is getting error #11 when it is not expected:
> when, for example, the number of listening processes is 10, each having a
> back buffer of at least 5, and 11-12 processes are connecting, some of them
> are getting an error...

Same here. It's not 100% repeatable, but it did happen occasionally during
my tests. I'll have to go back and try it again and see if I can find
where the hang-up is (client/server, kernel/userspace, etc).
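
One way to narrow down where the hang-up is would be to time each connect()
call on the client side and log any that stall. The following is a rough
sketch only, not the test client mentioned in this thread: the helper names,
the loopback listener, and the client count of 12 are all assumptions made
for illustration.

```python
import socket
import threading
import time

def timed_connect(host, port, timeout=60.0):
    """Return (seconds spent in connect(), exception or None)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    start = time.monotonic()
    error = None
    try:
        s.connect((host, port))
    except OSError as exc:
        error = exc
    elapsed = time.monotonic() - start
    s.close()
    return elapsed, error

def run_clients(host, port, count):
    """Start `count` clients at once and collect their connect() timings."""
    results = []
    def worker():
        results.append(timed_connect(host, port))
    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

def start_listener(backlog=5):
    """Hypothetical stand-in server: accept and immediately close."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))   # let the OS pick a free port
    srv.listen(backlog)
    def accept_loop():
        while True:
            try:
                conn, _ = srv.accept()
            except OSError:
                break            # listening socket was closed
            conn.close()
    threading.Thread(target=accept_loop, daemon=True).start()
    return srv

if __name__ == "__main__":
    srv = start_listener()
    port = srv.getsockname()[1]
    for elapsed, error in run_clients("127.0.0.1", port, 12):
        if elapsed > 1.0 or error is not None:
            print("slow or failed connect: %.2fs %r" % (elapsed, error))
    srv.close()
```

Pointing run_clients() at the real server instead of the stand-in listener,
and comparing the logged times with what the server sees in accept(), should
show whether the delay happens before the connection ever reaches userspace.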

> I attach the scripts I used to reproduce this problem. Or please tell me if
> this is expected behavior.
> As the server I tried not only a hand-written program but others as well,
> for example Apache; the problem persists.

<snip>

I can provide the source code to my clients and servers if requested. And
before anyone starts flaming me as well, this happens under all four
combinations of servers - forking, preforking, threaded and prethreaded.
So don't go blaming the server design unless there really is a major bug
in my implementation :)

-Steve

   Stephen R. Ives (Grad. ECE BS/MS, CS BS) | We are CMU students of Borg.
   Carnegie Mellon University | Sleep is irrelevant.
   email: ives+@andrew.cmu.edu | You will be caffeinated.
   http://www.subatomic.org/~ives/ |
