Re: Interesting scheduling times - NOT

Larry McVoy (lm@bitmover.com)
Wed, 23 Sep 1998 15:51:21 -0600


Oliver Xymoron <oxymoron@waste.org>:
: On Tue, 22 Sep 1998, Larry McVoy wrote:
: > How many times before it sinks in: 77% variance is not cache induced. If
: > that were true, then nothing would be deterministic. You wouldn't be able
: > to say "time make" and expect to get anything like the same number two times
: > in a row, yet people do that all the time.
:
: This is not really true, especially for something like make. If you
: combine n events of average length t and std dev u, you get an event of
: average length n*t, but a smaller std dev - if you multiply a bunch of
: bell curves together, you get a tighter curve. The possible range becomes
: larger, sure.

OK, so perhaps you can explain this build of ssh-1.2.13, over NFS,
8 way parallel so there is lots of cache interference, averages out to
2500 context switches/sec:

35.67u 12.39s 0:56.18r
35.96u 11.92s 0:55.28r
35.53u 12.74s 0:53.34r
35.64u 12.40s 0:54.19r
35.73u 12.43s 0:54.05r

The claim was that cache interference would explain variations in run
times of 70-100%. My counter claim was that if that were true, things
like make would not be deterministic. I think you may be claiming that
isn't true. However, in the most likely case to show lots of variation -
a parallel build over NFS so there are lots of context switches and cache
disturbances due to all that network traffic, I'm getting variations of
less than 6%. One argument could be that the system is spending all of its
time CPU bound so the interference is minimized; I have tried to counter
that by doing it parallel and over NFS. Given 2.5K ctx switch/sec, that's
an average time slice of 400 usecs which would lead one to believe that
the processes were not running for anything like their full 10,000 usec
time slices.
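
The time slice arithmetic above can be checked with a short Python sketch. The two figures come straight from the message; the constant names are made up for illustration, and the calculation assumes a single CPU, so that every context switch ends exactly one process's run.

```python
# Figures quoted in the message above.
CTX_SWITCHES_PER_SEC = 2500      # "2.5K ctx switch/sec"
FULL_TIME_SLICE_USEC = 10_000    # "full 10,000 usec time slices"

USEC_PER_SEC = 1_000_000

# Assumption (not stated in the message): one CPU, so each context
# switch marks the end of one process's run on that CPU.
avg_run_usec = USEC_PER_SEC / CTX_SWITCHES_PER_SEC

# How much of a full scheduler time slice an average run actually uses.
fraction_of_slice = avg_run_usec / FULL_TIME_SLICE_USEC

print(f"average run between switches: {avg_run_usec:.0f} usec")
print(f"fraction of a full slice:     {fraction_of_slice:.0%}")
```

Under that assumption the average run is 400 usec, about 4% of a full slice, which is the basis for saying the processes are being switched out long before their slice expires.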

I'm not about to hold up this test as scientifically perfect, but it is the
sort of thing that, given an argument that cache interference causes large
variations in run time, ought to show some large variations. Instead,
we're seeing variations of < 6%.

By the way, the local disk, cached in memory, non-parallel build times are:

34.25u 8.66s 0:44.25r
34.29u 8.34s 0:45.19r
34.18u 8.56s 0:45.42r
34.60u 8.07s 0:45.20r
34.55u 8.23s 0:45.21r

with context switch rate at about 230/second. The variation works out
to a little less than 3%.
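
The variation figures can be reproduced from the two sets of timings with a small Python sketch. The message does not say how "variation" was computed; this assumes it means the spread of elapsed (wall clock) time, (max - min) / min. The helper names are hypothetical.

```python
import re

# Timings copied from the message: user, system, elapsed (m:ss.ss).
PARALLEL_NFS = """
35.67u 12.39s 0:56.18r
35.96u 11.92s 0:55.28r
35.53u 12.74s 0:53.34r
35.64u 12.40s 0:54.19r
35.73u 12.43s 0:54.05r
"""

LOCAL_CACHED = """
34.25u 8.66s 0:44.25r
34.29u 8.34s 0:45.19r
34.18u 8.56s 0:45.42r
34.60u 8.07s 0:45.20r
34.55u 8.23s 0:45.21r
"""


def elapsed_seconds(line):
    """Extract the elapsed time field (e.g. '0:56.18r') as seconds."""
    match = re.search(r"(\d+):(\d+\.\d+)r", line)
    return int(match.group(1)) * 60 + float(match.group(2))


def spread_percent(timings):
    """Assumed definition of 'variation': (max - min) / min, in percent."""
    times = [elapsed_seconds(l) for l in timings.split("\n") if l.strip()]
    return 100.0 * (max(times) - min(times)) / min(times)


print(f"parallel over NFS: {spread_percent(PARALLEL_NFS):.1f}%")
print(f"local, cached:     {spread_percent(LOCAL_CACHED):.1f}%")
```

With that definition the parallel NFS runs spread by roughly 5.3% and the local runs by roughly 2.6%, consistent with the "less than 6%" and "a little less than 3%" figures quoted here.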

I stand by my statement that cache conflicts, in the make case, the
parallel make case, and in Richard's test, are not the source of anything
like 70% or greater run time variances. In fact, I'd be willing to bet
that the cache conflicts cause no more than 1/10th that variation. Which
has been my point all along - the problem isn't the operating system,
it isn't the cache, it isn't the memory controller, it's the benchmark.
