Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3

From: Evgeniy Polyakov
Date: Thu Feb 22 2007 - 08:35:42 EST

On Thu, Feb 22, 2007 at 01:59:31PM +0100, Ingo Molnar (mingo@xxxxxxx) wrote:
> * Evgeniy Polyakov <johnpol@xxxxxxxxxxx> wrote:
> > It is not a TUX anymore - you had 1024 threads, and all of them will
> > be consumed by tcp_sendmsg() for slow clients - rescheduling will kill
> > a machine.
> maybe it will, maybe it wont. Lets try? There is no true difference
> between having a 'request structure' that represents the current state
> of the HTTP connection plus a statemachine that moves that request
> between various queues, and a 'kernel stack' that goes in and out of
> runnable state and carries its processing state in its stack - other
> than the amount of RAM they take. (the kernel stack is 4K at a minimum -
> so with a million outstanding requests they would use up 4 GB of RAM.
> With 20k outstanding requests it's 80 MB of RAM - that's acceptable.)

I tried already :) - I just made a allocations atomic in tcp_sendmsg() and
ended up with 1/4 of the sends blocking (I counted both allocation
failure and socket queue overflow). Those 20k blocked requests were
created in about 20 seconds, so roughly saying we have 1k of thread
creation/freeing per second - do we want this?

> > My tests show that with 4k connections per second (8k concurrency)
> > more than 20k connections of 80k total block in tcp_sendmsg() over
> > gigabit lan between quite fast machines.
> yeah. Note that you can have a million sleeping threads if you want, the
> scheduler wont care. What matters more is the amount of true concurrency
> that is present at any given time. But yes, i agree that overscheduling
> can be a problem.

Before I started M:N threading library implementation I checked some
threading perfomance of the current POSIX library - I created simple
pool of threads and 'sent' a message between them using futex wait/wake
(sema_post/wait) one-by-one, results are quite dissapointing - given
that number of sleeping threads were about hundreds, kernel rescheduling
is about 10 times slower than setjmp based (I think so) used in Erlang.

Above example is not 100% correct, I understand, but situation with
thread-like AIO is much worse - it is possible that several threads will
be ready simultaneous, so rescheduling between them will kill a

> btw., what is the measurement utility you are using with kevents ('ab'
> perhaps, with a high -c concurrency count?), and which webserver are you
> using? (light-httpd?)

Yes, it is ab and lighttpd, before it was httperf (unfair on high load
due to poll/select usage) and own web server (evserver_kevent.c).

> Ingo

Evgeniy Polyakov
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at