Re: Stabilizing execution time?

Linus Torvalds (torvalds@cs.helsinki.fi)
Fri, 19 Jul 1996 07:48:00 +0300 (EET DST)


On Thu, 18 Jul 1996, Hubert A. Bahr wrote:
>
> What can be done beyond running in single user mode to
> bring execution time to a more repeatable state?

That doesn't necessarily help anything. Execution time in low-load
"normal state" should be roughly the same as single user mode. As far as
the kernel is concerned it's exactly the same thing, of course, and the
only difference is that you _may_ have other processes running (but
especially on a workstation type setup there seldom are many of those,
and they seldom result in 30 seconds worth of CPU usage).

> The problem I have is wanting to test different versions of the
> same simulation for performance. All the data is computed and
> the output is identical. The object code is about 80k stripped
> but due to the data being generated the occupied memory will run
> about 4Meg. The execution time is on the order of 9 minutes with
> up to 30 seconds variation. This is on a 486DX-2/66 with a
> 256K L2 cache with 16Meg 70ns Ram.

Note that depending on what you're doing, these kinds of variations may
be simply due to things like cache effects. Most external PC caches are
direct-mapped, and that means that there can easily be bad cache effects
if the physical pages you get happen to have bad cache layout for your
problem.
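
To make the direct-mapped effect concrete, here is a minimal sketch (an
illustration only, not kernel code). It assumes 4K pages as on the i386
and the 256K cache mentioned in the question; page_colour and
worst_overlap are made-up names. In such a cache, two physical pages
whose frame numbers differ by a multiple of 64 compete for the same
cache lines:

```python
from collections import Counter

CACHE_SIZE = 256 * 1024                 # 256K external cache, from the question
PAGE_SIZE = 4096                        # assumption: 4K pages, as on i386
NUM_COLOURS = CACHE_SIZE // PAGE_SIZE   # 64 page-sized slots in the cache


def page_colour(phys_addr):
    """Page-sized slot of a direct-mapped cache that this address maps to."""
    return (phys_addr // PAGE_SIZE) % NUM_COLOURS


def worst_overlap(frames):
    """Largest number of page frames (by frame number) sharing one slot."""
    counts = Counter(page_colour(pfn * PAGE_SIZE) for pfn in frames)
    return max(counts.values())


# 64 physically contiguous frames cover the cache with no overlap ...
good = list(range(100, 164))
# ... while 64 frames spaced 64 apart all land on the same slot.
bad = [100 + i * NUM_COLOURS for i in range(64)]

print(worst_overlap(good), worst_overlap(bad))  # prints: 1 64
```

Which of these two layouts a process ends up closer to depends on which
free pages happen to be handed out, and that can differ from run to run.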

Some other unixes try to avoid this by being clever about physical page
allocation: that may not actually help execution times (you can still get
bad cache access patterns), but it _does_ make execution time more
predictable. For some things predictability may even be preferable to
speed (some hard-real-time people don't like caches at all because they
can result in so much unpredictability).

The clever allocation of physical memory for cache effects (page
colouring) usually helps performance too, especially for the "nice"
access patterns (i.e. single large arrays in memory), so we'll probably
have to do that some day.
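
A rough sketch of the idea, under the same assumptions as before (64
colours from a 256K direct-mapped cache and 4K pages). ColouredAllocator
is a hypothetical toy and not how any particular kernel does it: keep
one free list per colour, and hand out a frame whose colour matches the
virtual page so that consecutive virtual pages spread evenly over the
cache.

```python
from collections import defaultdict

NUM_COLOURS = 64  # assumption: 256K direct-mapped cache / 4K pages


class ColouredAllocator:
    """Toy page-frame allocator with one free list per cache colour."""

    def __init__(self, free_frames):
        self.free = defaultdict(list)
        for pfn in free_frames:
            self.free[pfn % NUM_COLOURS].append(pfn)

    def alloc(self, virt_page):
        """Return a free frame, preferring the colour of virt_page."""
        wanted = virt_page % NUM_COLOURS
        # Try the preferred colour first, then the following colours in turn.
        for offset in range(NUM_COLOURS):
            colour = (wanted + offset) % NUM_COLOURS
            if self.free[colour]:
                return self.free[colour].pop()
        raise MemoryError("no free page frames")


allocator = ColouredAllocator(range(1000, 1256))
frames = [allocator.alloc(vpage) for vpage in range(64)]
# Each of the 64 consecutive virtual pages gets its own colour.
print(len({pfn % NUM_COLOURS for pfn in frames}))
```

The fallback loop is why this buys predictability rather than a
guarantee: once the preferred colour runs out, you get whatever is left.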

> I am guessing that one contribution is the load point of
> the code since it could change overlap in cache. Anything else is
> beyond me. System time is on the order of .2 sec as reported by
> the time command. The variation noted is when running the identical
> code repeatedly and it doesn't appear to have any cyclic effect.

You are unlikely to get much system time variation, as the kernel seldom
does anything which is very susceptible to cache effects. Most of the
time difference is likely to be in user mode or in IO effects (and I
assume you have minimized those effects? I assume it's CPU-bound?)
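
One way to check where the variation sits is to record user and system
time separately over a number of runs. The following is only a sketch:
time_runs and spread are made-up helper names, and it relies on
getrusage() child accounting as exposed by Python's resource module on
Unix.

```python
import resource
import subprocess


def time_runs(cmd, runs):
    """Run cmd repeatedly; return a list of (user, system) CPU seconds."""
    samples = []
    for _ in range(runs):
        before = resource.getrusage(resource.RUSAGE_CHILDREN)
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        after = resource.getrusage(resource.RUSAGE_CHILDREN)
        samples.append((after.ru_utime - before.ru_utime,
                        after.ru_stime - before.ru_stime))
    return samples


def spread(values):
    """Difference between the slowest and the fastest sample."""
    return max(values) - min(values)


# Example use, with "./sim" standing in for the simulation binary:
#   samples = time_runs(["./sim"], 10)
#   print("user spread:  ", spread([u for u, _ in samples]))
#   print("system spread:", spread([s for _, s in samples]))
```

If the user-time spread accounts for nearly all of the variation while
wall-clock time tracks it closely, the program is CPU-bound and IO is
not the culprit.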

Linus