Auto-Adaptive scheduler - Final chapter ( the numbers ) ...

From: Davide Libenzi (davidel@maticad.it)
Date: Wed Jan 26 2000 - 17:36:50 EST

Hi guys,

these are the numbers ( benchmarks ) of the Auto-Adaptive scheduler patch
applied to the 2.2.14 kernel.

[First of all]
I'd like to thank everyone who reported their results with the patch to me.

[Second of all]
My initial enthusiasm about the performance improvement the patch would give
was due to a mismatch between the source code of the program I used for
testing and the compiled binary itself.
Some time ago I modified the code to divide the total number of switches by
the seconds of execution, and inside threads.c ( which I include with this
message ) I see :

value /= (int) difftime(te, ts);

The compiled program I used for the tests was the old version, which reports
the total number of switches.
Since I ran the tests for 30 secs, I measured 30 times the real switch time.
This led me to believe that under SpecWeb96 loads as seen in vmstat
( RQ = 20 , 13000 switches / sec ) 50% of the bench time was spent in the
scheduler.
For RQ = 20 and 13000 switches / sec I now estimate a scheduler cost of
about 7%.
Under such loads a 50% faster schedule() results in a 3.5% global
performance improvement.
I have reports of 8-way SMP machines reaching RQ = 50 and 25000 switches / sec
under "normal" use ( not benchmarks ).
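
The mix-up described above comes down to whether the raw switch count gets divided by the run length. A minimal sketch of that normalization, in the spirit of the difftime line quoted above; the function name switches_per_sec and its arguments are hypothetical and not taken from threads.c:

```c
#include <time.h>

/* Hypothetical helper, not the code from threads.c: turn a raw count of
 * context switches into a per-second rate, given the wall-clock start (ts)
 * and end (te) of the run. Returns -1 if the run lasted less than one
 * whole second, because the integer cast would then divide by zero. */
long switches_per_sec(long total_switches, time_t ts, time_t te)
{
    int secs = (int) difftime(te, ts);   /* whole seconds of execution */

    if (secs <= 0)
        return -1;
    return total_switches / secs;
}
```

With a 30 second run, a binary that skips this division prints a figure 30 times larger than the per-second rate.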

[THE NUMBERS] ==> PII 400 MHz 128 MB RAM

A) COST OF PATCH

1) Performance

The RQ = 2 case gives me :

Old = 759000 switches / sec = 1.317 us
New = 735000 switches / sec = 1.360 us

This means a _switching_rate_ 3.16 % lower ( about 3.3 % more time per switch ).
But the switch rate of a real system with RQ = 2 is far below 759000 / sec.
Analysis of even heavily loaded RQ = 2 systems shows that the switch rate
stays under 2000 switches / sec ( often much less ).
So we have :

Old = 2000 * 1.317 us = 2.635 ms
New = 2000 * 1.360 us = 2.721 ms

So every second we lose :

dT = 2.721 ms - 2.635 ms = 0.086 ms ( in a second )

DT% = ( 0.086e-3 / 1 ) * 100 = 0.0086 %

So in the RQ = 2 case we have a performance loss of 0.0086 %.
With RQ = 3 we have a performance loss of 0.0036 %.
With RQ = 4 performance is exactly equal.
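
The RQ = 2 arithmetic above can be rechecked with a small sketch. The function names are made up for illustration, and the 2000 switches / sec figure is the assumed real-world rate from the text, not something the benchmark measures:

```c
/* Time of one switch in microseconds, from a benchmarked switch rate. */
double switch_time_us(double switches_per_sec)
{
    return 1e6 / switches_per_sec;
}

/* Percentage of each second lost to the slower scheduler on a machine
 * that really performs real_rate switches per second. old_rate and
 * new_rate are the benchmarked rates without and with the patch. */
double patch_cost_percent(double old_rate, double new_rate, double real_rate)
{
    double dt_us = switch_time_us(new_rate) - switch_time_us(old_rate);

    /* dt_us * real_rate is microseconds lost per second of run time */
    return dt_us * real_rate / 1e6 * 100.0;
}
```

Calling patch_cost_percent(759000, 735000, 2000) gives a value close to the 0.0086 % computed by hand.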

2) Code

The patch adds about 180 lines of code to sched.c including comments,
less than 10 in sched.h and less than 5 in fork.c.

B) ADVANTAGES OF PATCH

1) Performance

Some people have reported performance boosts of about 10 % with 30 processes
in RQ and a switch rate of about 12000 / sec.

Others running SpecWeb96, which loads their machines with about RQ = 20 and
10000 switches / sec, have reported either equal performance or a 5 % boost.

Where must we go to get a boost ?

For RQ = 30 I've measured :

Old = 190000 switches / sec = 5.26 us
New = 220000 switches / sec = 4.54 us

dT = 0.72 us
DT% = ( 0.72 / 5.26 ) * 100 = 14 % ( of the switching time, not the overall time )

To get the overall gain we must evaluate how much time the scheduler costs
per second.

Suppose you have a system that makes 20000 switches / sec. We have :

ST = 5.26 us * 20000 = 0.11 sec
ST% = ( 0.11 / 1 ) * 100 = 11 %

So the overall gain will be :

OG% = 11 % * 14 % = 1.5 % ( RQ = 30 and 20000 switch / sec )

For RQ = 50 and 35000 switch / sec ( we probably need SMP ;) ) :

Old = 95000 switches / sec = 10.5 us
New = 120000 switches / sec = 8.3 us

dT = 2.2 us
DT% = ( 2.2 / 10.5 ) * 100 = 21 % ( of the switching time not overall time )

ST = 10.5 us * 35000 = 0.37 sec
ST% = ( 0.37 / 1 ) * 100 = 37 %

So the overall gain will be :

OG% = 37 % * 21 % = 7.8 % ( RQ = 50 and 35000 switch / sec )
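
Both estimates follow the same recipe: the share of each second spent switching, multiplied by the relative saving per switch. A sketch of that recipe, keeping the post's assumption that every switch under load costs the benchmarked time; all names are hypothetical:

```c
/* Share of each second spent switching, as a fraction (0.11 = 11 %).
 * One switch takes 1 / old_rate seconds, so load_rate of them take
 * load_rate / old_rate seconds out of every second. */
double sched_share(double old_rate, double load_rate)
{
    return load_rate / old_rate;
}

/* Relative saving per switch, as a fraction of the old switch time. */
double switch_saving(double old_rate, double new_rate)
{
    return 1.0 - old_rate / new_rate;
}

/* Overall gain, in percent of total run time. */
double overall_gain_percent(double old_rate, double new_rate, double load_rate)
{
    return sched_share(old_rate, load_rate)
         * switch_saving(old_rate, new_rate) * 100.0;
}
```

Fed the unrounded rates, overall_gain_percent(190000, 220000, 20000) comes out near 1.4 and overall_gain_percent(95000, 120000, 35000) near 7.7, slightly below the hand-rounded 1.5 % and 7.8 % above.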

For RQ = 100 I get a 35 % faster __switching__ time, while for RQ = 200 it
is 60 % faster.

Now, given these numbers, my questions are :

1) Do loads exist "in nature" that can benefit from this patch ?
2) Can an SMP machine today reach RQ = 100 and 40000 switches / sec on users'
systems ?
3) Will it be able to tomorrow ?
4) Does Linux plan to scale to SMP machines with N > 4, or, given the nature
of the kernel, is it better to invest in clusters ?

I'm anything but sure !

As a bald Italian man used to say, "Good work to everyone !",
Davide.

-- "A men that know, whisper, the one that don't, shout."


This archive was generated by hypermail 2b29 : Mon Jan 31 2000 - 21:00:17 EST