Re: Thread implementations...

Adam D. Bradley (artdodge@cs.bu.edu)
Thu, 25 Jun 1998 14:21:01 -0400 (EDT)


On Thu, 25 Jun 1998, Linus Torvalds wrote:

> One thing is actually the latency of setting up a small transfer. This
> sounds unimportant, but it's actually fairly important in order to do well
> under load: the lower latency you have, the more likely you are to not get
> into the bad situation that you have lots of outstanding requests and all
> while you serve those you get new requests at the same rate and never make
> any progress after a certain load.

This is actually one of the most crucial parts of really massive
server performance, since most documents are tiny. (Web document
sizes follow a heavy-tailed distribution: the typical document is
small, but the average byte belongs to a big file. One illustration,
among others, is the gap between the median document size, less
than 3KB, and the mean document size, over 6KB.)
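
To make the median/mean gap concrete, here is a rough sketch in Python.
It assumes document sizes follow a Pareto distribution; the shape
(alpha = 1.2) and minimum size (1000 bytes) are made-up illustrative
values, not fitted to any real trace. It samples the distribution
deterministically at evenly spaced quantiles and reports the median,
the mean, and the share of bytes carried by the largest 10% of
documents.

```python
import statistics


def pareto_sizes(n, alpha, x_min):
    """Deterministic Pareto 'sample': inverse CDF at evenly spaced quantiles."""
    return [x_min / (1.0 - (i + 0.5) / n) ** (1.0 / alpha) for i in range(n)]


def top_share(sizes, fraction):
    """Fraction of all bytes carried by the largest `fraction` of documents."""
    ordered = sorted(sizes, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


# Hypothetical parameters, chosen only to show the effect.
sizes = pareto_sizes(n=100_000, alpha=1.2, x_min=1000.0)

median = statistics.median(sizes)
mean = sum(sizes) / len(sizes)
share = top_share(sizes, 0.10)

print(f"median size: {median:.0f} bytes")
print(f"mean size:   {mean:.0f} bytes")
print(f"bytes in the largest 10% of documents: {share:.0%}")
```

Under these assumptions the mean lands well above the median, and the
largest tenth of the documents accounts for more than half of the
bytes, which is the "average byte is part of a big file" effect.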

> And web serving is one of the things a lot of people want. And if they
> make their judgements by benchmarks, we'd better be good at them. Never
> discount benchmark numbers just because you don't like the benchmark: I
> much prefer to go by real numbers than by "feeling".
>
> I know some people that every time they see Linux beating somebody at a
> benchmark, they claim that "the benchmark is meaningless, under real load
> the issues are different". That's a cop-out. If NT is better than Linux at
> something, we'd better look out or have a _really_ good explanation.. And I
> think webstone is "real enough" that we can't really explain it away.

The fact is that most benchmark software (webstone, specweb, etc) goes
easy on servers - so if NT can out-perform us here, it means we're
not doing the "easy" stuff efficiently. I, for one, am _VERY_
hesitant to rely on "emergent behaviors" showing up when the server is
put under a more realistic (bursty, self-similar, heavy-tailed, etc)
load, since that load can be roughly characterized as a massive
superimposition of those simple workloads (with different mean sizes,
inter-arrival times, etc) coupled with bursty properties (high load
is a strong indicator of continued high load, hence the queue problem
Linus mentioned; high variability on all timescales; etc).
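
The queue problem can be sketched with a toy fluid model in Python. This
is only an illustration under invented numbers: a single server, a
steady 900 requests/sec with one 5-second burst of 2000 requests/sec,
and two hypothetical per-request costs (1000 us and 800 us). It counts
how many seconds the server spends with requests outstanding.

```python
def backlog_history(arrivals, per_request_us):
    """Fluid single-server queue: backlog (in requests) after each second."""
    capacity = 1_000_000 // per_request_us  # requests completed per second
    backlog = 0
    history = []
    for arrived in arrivals:
        backlog = max(0, backlog + arrived - capacity)
        history.append(backlog)
    return history


def seconds_backlogged(history):
    """Number of seconds that end with requests still outstanding."""
    return sum(1 for b in history if b > 0)


# Hypothetical load: steady 900 req/s, a 5-second burst at 2000 req/s,
# then steady again.
arrivals = [900] * 10 + [2000] * 5 + [900] * 60

slow = backlog_history(arrivals, per_request_us=1000)  # 1000 req/s capacity
fast = backlog_history(arrivals, per_request_us=800)   # 1250 req/s capacity

print("peak backlog:", max(slow), "vs", max(fast))
print("seconds backlogged:", seconds_backlogged(slow), "vs",
      seconds_backlogged(fast))
```

In this sketch a 20% cut in per-request cost shortens the recovery from
the same burst several times over, because the backlog drains at the
difference between capacity and the arrival rate, and that margin is
small when the server runs close to its limit.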

Shameless plug for a friend's web benchmarking research, SURGE:
http://www.cs.bu.edu/techreports/97-006-surge.ps.Z

Adam

--
You crucify all honesty             \\Adam D. Bradley  artdodge@cs.bu.edu
No signs you see do you believe      \\Boston University Computer Science
And all your words just twist and turn\\    Grad Student and Linux Hacker
Reviving just to crash and burn        \\                             <><
--------->   Why can't you listen as love screams everywhere?   <--------
