Re: Auto-Adaptive scheduler - Final chapter ( the numbers ) ...

From: Marco Colombo (marco@esi.it)
Date: Fri Jan 28 2000 - 15:37:23 EST


On Fri, 28 Jan 2000, Davide Libenzi wrote:

[...]

> My Mother taught me two basic things :
>
> 1) Have always respect of people that have respect of other people dignity

[no comment]

I haven't read any book on modern CPUs (the last one was on the Z80) B-).
IMHO, this has little to do with my "dignity".

> 2) Always go in deep when analyzing things, and never stop at a superficial
> view

just to be fair...

Speaking of being superficial, you should at least *listen* to what (many!)
people keep telling you.

[...]

> The first data sheet I've studied was the Intel 8088 one in fall 1984, and
> I've continued until now since electronics and computer science is my
> passion
> as long as my work.
> I don't want to bore other guys with cache lines architectures, computers
> layouts, etc ...,
> as long as I don't see this list as a way to recite data sheets to show
> others own knowledge.
> Reading data sheets don't makes You a guru, but simply a reader.
> Using the gray material that You've inside that sphere that You've upon Your
> shoulders,
> this makes You a guru.

Please point me to the code you've written, instead of the books you've
read. This way i can get a better picture of your ideas on OS designing.

[...]
> Horst say :
> > Again: Low load is << 1; normal load is 1 or thereabouts; 2 or more is
> high
> > load; 10 or so is already ridiculously overloaded (UP case, for SMP
> roughly
> > multiply by #CPUs). This has been shown here with real world data from
> > higly loaded servers all over the place.
>
> How many switch / sec will do a system with RQ << 1 ?
> Stay big say 500 ( keep this number in mind ).
>

??? where do this figure come from?

Let me state this: switch rate has *nothing* to do with load, nor does
the RQ. [ And RQ is not the same of context switching rate, BTW ].

As someone may have already pointed out, you can have a server with 5
clients which shows a RQ of 5, if the backend server is trivially
implemented with the basic accept()/fork() framework, or a RQ of 1
(and NEVER MORE, even with 1000 clients) if implemented with a clever
select() loop (there are clever ways to do the fork() thing, too).
Note that the load on the server is the same, and (with 5 clients) also
the performances will be nearly the same.

Think of a web server with a very small static web site on a RAM disk,
and 20 clients continuosly requesting pages (a typical silly benchmark
load, and by no means a real world case). Different server implementations
give different average RQ lenghts. I mean really different ones, from
just 1 to around 20! And different switching rate. But the load placed
on the system is the same! On a somewhat high-end system you may even
get the same throughput.

So please stop stating that:

                 long RQ <=> high loads (1)

And please don't go on with other arguments until you've somewhat proven
(1) is true for any real world case.

That's 1 person [me] telling you that (1) is false.

> Jamie say :
> > Real world problems do not have that kind of load. Real world problems
> > _do_ have cache issues.
> > The cache issue is so important for real applications that optimising
> > the scheduler under unusual RQ loads isn't worth doing.

That's 2.

> Rik say :
> > > Real world problems do not have that kind of load. Real world
> > > problems _do_ have cache issues.

That's 3.

> > Indeed. We should optimise the scheduler for these, not
> > for the fastest possible switch times.

That's 4 [Larry, i think].

That should be enough for you to start *listening* to what these people
are saying... if you don't agree, just say why.

I can't think of any RL example in which a long RQ does not come out of
a bad design (including the use of a underpowered HW in the first place,
of course). I've said REAL LIFE EXAMPLE... not just a kind of benchmark.

IMHO, the following is true (again 99% of cases, maybe):

                long RQ <=> broken design (somewhere) (2)

for "long RQ" i mean > k * N, where N is the numember of CPUs, and
k somewhere in the 1 - 3 range. The bigger is N, the smaller should be k.
For 16 CPUs, i'd set k well under 2 (i.e. the RQ should be near N).
On UP, you may get slightly longer RQs, if you are resource-constrained.

Moreover, i'd also state that:

                long RQ <=> something very bad happening (3)

and scheduler being under-optimized for that RQ size is NOT your
biggest problem in such a situation. The system is over-loaded.
CPU caches are probably being constantly thrashed.

I see that (3) is also what Jamie, Rik and Larry said to you. Againg,
you've made no arguments against that. Please give any RL example
where you have just scheduler overhead (because of a long RQ) and
at the same time everything else is working fine.

[analysis skipped]

> The data cache cost of the patch with RQ < "Cluster scheduler boot point" is
> a variabile
> onto the stack ( 4 bytes ), that thinking on about, I can even remove.
> The code cache cost of the patch with RQ < "Cluster scheduler boot point" is
> 0,
> due to the fact that the code is the same of the previous scheduler.
> There is the block of code of the cluster scheduler that can be moved out of
> fast path,
> but I think that It'll be better to keep it on fast path since we need it
> faster under
> high workloads not with RQ < 2.

Againg, you can have high workloads with RQ ~= 1!

> But all of this topics would be clear if that guys that shoot over my patch
> cache misses,
> would have readed it.

You should be reading, in the first place...
The cache issue mainly cames into play when the system is LOADED... and:
- this can happen with RQ being small. The scheduler should be optimized
  for that (and actually is), since this is the NORMAL condition under
  high loads;
- by the time you get a long RQ, the system is so overloaded (in real cases)
  that the cache issue is a MAJOR one anyway, no matter the scheduler...

And, yes, I can think of a way to enlarge the RQ without other big
system impacts. Just spawn N identical processes (or threads, whatever),
doing just endless for(;;); and you'll see you RQ go to N.
This is NOT a RL workload. And anyway you'll patch will have little
effect on this. The scheduler gets invoked only 1/timeslice times a second.
No matter how long the RQ is.

The only thing you can do to get the scheduler play a major role, is
to have the threads do *nothing*, just release the CPU as soon as they
get scheduled. I can't think of any real world application that will
come close to that behaviour, unless it's very badly designed (or very
badly used).

[...]

Please, if you want to answer to this message, give a proof of that:
 (1) may be true and
 (3) may be false,
for some real world applications.

Otherwise, the whole discussion is useless.

.TM.

-- 
      ____/  ____/   /
     /      /       /			Marco Colombo
    ___/  ___  /   /		      Technical Manager
   /          /   /			 ESI s.r.l.
 _____/ _____/  _/		       Colombo@ESI.it

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Jan 31 2000 - 21:00:21 EST