Re: Auto-Adaptive scheduler - Final chapter ( the numbers ) ...

From: Jamie Lokier (lkd@tantalophile.demon.co.uk)
Date: Fri Jan 28 2000 - 18:49:07 EST


Marco Colombo wrote:
> Davide Libenzi wrote:
> > 1) Always have respect for people who have respect for other people's dignity
>
> I haven't read any book on modern CPUs (the last one was on the Z80) B-).
> IMHO, this has little to do with my "dignity".

No, the point was that the rest of us are not respecting Davide's dignity,
because we're treating him as if he doesn't know what a cache is, or as
if he doesn't read our mails. But he does. I thought that was obvious.

The thread does have some very, how shall I say, entrenched attitudes...

> Speaking of being superficial, you should at least *listen* to what (many!)
> people keep telling you.

He did. He made it obvious in the post you just quoted.

> > How many switch / sec will do a system with RQ << 1 ?
> > Stay big say 500 ( keep this number in mind ).
>
> ??? where does this figure come from?

It's an assumption, to demonstrate a technical point.
The actual figure does not matter.

> [ And RQ is not the same as context switching rate, BTW ].

Davide knows that RQ is different from switching rate.

> Let me state this: switch rate has *nothing* to do with load, nor does
> the RQ.

which you follow with this counter-example:

> you can have a server with 5 clients which shows a RQ of 5, if the
> backend server is trivially implemented with the basic accept()/fork()
> framework, or a RQ of 1 (and NEVER MORE, even with 1000 clients) if
> implemented with a clever select() loop (there are clever ways to do
> the fork() thing, too). Note that the load on the server is the same,
> and (with 5 clients) the performance will also be nearly the same.

As you can clearly see,

   - when the first server is busy, it has a RQ of 5 and switches often
   - when the second server is busy, it has a RQ of 1 and never switches

Therefore in your example, RQ and switching rate are very closely related.

Real servers are not so simple, of course. But let us stick with the
simple one.
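
To make the contrast concrete, here is a minimal sketch of the select()
model (the port number and the echo behaviour are arbitrary
illustrations, and error handling is omitted). With accept()/fork(),
five busy clients mean five runnable processes and an RQ of 5; the
single loop below stays at an RQ of 1 however many clients it serves:

    /* Minimal single-process select() server: one task serves
     * every client, so the run queue never holds more than this
     * one process on the service's behalf. */
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        struct sockaddr_in addr;
        int listener = socket(AF_INET, SOCK_STREAM, 0);
        fd_set fds;
        int maxfd, fd;
        char buf[4096];

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(8080);    /* example port */
        bind(listener, (struct sockaddr *) &addr, sizeof(addr));
        listen(listener, 5);

        FD_ZERO(&fds);
        FD_SET(listener, &fds);
        maxfd = listener;

        for (;;) {
            fd_set ready = fds;

            /* While clients keep us busy, select() returns
             * immediately, we never block, and RQ stays at 1. */
            select(maxfd + 1, &ready, NULL, NULL, NULL);

            for (fd = 0; fd <= maxfd; fd++) {
                if (!FD_ISSET(fd, &ready))
                    continue;
                if (fd == listener) {
                    int c = accept(listener, NULL, NULL);
                    FD_SET(c, &fds);
                    if (c > maxfd)
                        maxfd = c;
                } else {
                    ssize_t n = read(fd, buf, sizeof(buf));
                    if (n <= 0) {
                        close(fd);
                        FD_CLR(fd, &fds);
                    } else {
                        write(fd, buf, n);  /* trivial echo */
                    }
                }
            }
        }
    }

The accept()/fork() version differs only in that each connection gets
its own process doing the read/write loop, which is exactly what puts
five tasks on the run queue when five clients are busy.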

> So please stop stating that:
>
> long RQ <=> high loads (1)

Your own example just demonstrated that long RQ <=> high switching rate.

> And please don't go on with other arguments until you've somewhat proven
> (1) is true for any real world case.

I'll leave that to you...

> That's 1 person [me] telling you that (1) is false.
>
> > Jamie say :
> > > Real world problems do not have that kind of load. Real world problems
> > > _do_ have cache issues.
> > > The cache issue is so important for real applications that optimising
> > > the scheduler under unusual RQ loads isn't worth doing.
>
> That's 2.

Thanks for quoting me, but you have misunderstood it.

I did not confirm the assertion (1). I said (and really I was only
explaining others' words) that cache reload timing is more significant
than scheduler switching time.

The corollary is that both the scheduler and the application should be
written to minimise cache loading first, and things like switching time
second. And that means keeping the scheduler small, and optimising
applications so they have small RQs.

I said nothing about the relationship between RQ length and switching
rates, or system load.

> I can't think of any RL example in which a long RQ does not come out of
> a bad design (including the use of an underpowered HW in the first place,
> of course). I've said REAL LIFE EXAMPLE... not just a kind of
> benchmark.

You may be right here. That is certainly what everyone on the "keep the
scheduler simple" side has said.

But IMO, Davide has cleverly pointed out that the "good applications
don't have a large RQ" assertion isn't actually relevant to the optimisation.

Why? Because all the *good* applications don't *care* if the scheduler
is simple or not. Under load, good applications *don't schedule at all*!

Now, that's not really true of web servers invoking CGI and copying the
response through a pipe to a socket, etc. If you call that an
interesting case, I agree. If you call that broken design, then we can
assume the simple single-threaded server model. So let's assume that...

In other words, you have taken A => B and deduced B => A.
We both know that's a common logical mistake.

> [snip variations on the theme "long RQ <=> broken application design"]

I hope I have managed to explain why this point is not relevant to
optimising the scheduler. (Assuming simple single-thread-is-busy
applications.) From this point of view, non-broken application designs
just don't care about the scheduler at all.

> Again, you can have high workloads with RQ ~= 1!

You can. When RQ <= 1, you don't care about the scheduler performance.

> You should be reading, in the first place...
> The cache issue mainly comes into play when the system is LOADED... and:
> - this can happen with RQ being small. The scheduler should be optimized
> for that (and actually is), since this is the NORMAL condition under
> high loads;

Ah, but is the normal condition that the scheduler isn't entered at all
under that kind of high load?

I don't think it is, because real servers cobble together complicated
stuff, but it is true in the select()-based server example.

> - by the time you get a long RQ, the system is so overloaded (in real cases)
> that the cache issue is a MAJOR one anyway, no matter the scheduler...

Now now, that isn't true if all the processes are scheduling around the
same memory image. At that point we might even find that it's kernel
stack cache conflicts that are most significant, in which case we might
want to colour the stack entry position a little...
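
To be concrete about the colouring (a purely illustrative sketch; the
names and constants are my assumptions, not code from any actual
kernel):

    /* Illustrative only: offset each new task's kernel stack top
     * by a different multiple of a cache line, so the hot ends of
     * different tasks' stacks do not all map to the same cache
     * sets.  Line size and colour count are assumed values. */
    #define CACHE_LINE_SIZE 32
    #define STACK_COLOURS   8

    static unsigned long next_colour;

    static unsigned long colour_stack_top(unsigned long stack_top)
    {
        unsigned long colour = next_colour++ % STACK_COLOURS;

        return stack_top - colour * CACHE_LINE_SIZE;
    }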

> The only thing you can do to get the scheduler to play a major role is
> to have the threads do *nothing*, just release the CPU as soon as they
> get scheduled.
>
> I can't think of any real world application that will
> come close to that behaviour, unless it's very badly designed (or very
> badly used).

If you have threads doing sendfile, or write from already mapped files,
there will be relatively little icache footprint and it will be shared
between the different threads.
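
Something like the worker below is what I have in mind (an illustrative
sketch; the function name and its arguments are mine, not from any real
server):

    /* Each thread runs this same tiny loop, so the user-space
     * icache footprint is small and shared between threads; the
     * actual data movement happens inside the kernel. */
    #include <sys/sendfile.h>
    #include <unistd.h>

    void serve_file(int sock_fd, int file_fd, size_t length)
    {
        off_t offset = 0;

        while ((size_t) offset < length) {
            ssize_t n = sendfile(sock_fd, file_fd, &offset,
                                 length - (size_t) offset);
            if (n <= 0)
                break;
        }
    }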

The dcache footprint will be identical whatever the server architecture,
threads or not. (Related request clustering is not being considered
here... that's an orthogonal thing.)

I have written a more coherent summary of the technical argument in
another message, subject "On optimising the scheduler for large run
queues".

be well,
-- Jamie
