Re: Overscheduling DOES happen with high web server load.

Greg Lindahl (lindahl@cs.virginia.edu)
Fri, 7 May 1999 11:52:01 -0400 (EDT)


> Ok, you are right. The real problem is we are calculating goodness
> O(A*B).
>
> A= Number of processes on the runqueue
> B= Number of times schedule is called

This is the amount of work we are doing, but I think you're on to the
wrong solution. Phil did a test: he patched schedule() to pick the
first schedulable process instead of the best one. That dropped the
amount of time consumed in schedule() from 20% to 1%. That's the same
as reducing the work from O(A*B) to O(B).

Why is schedule called so frequently? The thundering herd. You get a
new connection, everyone wakes up, only one gets work, and everyone
else goes back to sleep, each causing O(A) work to reschedule. M hits
per second, N processes, B=M*N. A=N. So the total work is O(M*N^2).

These are separate problems. The thundering herd is fixed by
wake_one. The cost of scheduling is still a problem; if we had
many-cpu SMP linux boxes with high loads, all the cpus would sit
around waiting for the scheduler lock. Of course, there would be other
problems too. Since we can't fix all possible thundering herd
situations, I think we should fix the scheduler too.

Phil, did the SpecWeb score rise with the patch? It should have. Of
course, that hacked scheduler is pretty broken...

-- g

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/