Re: 2.1.106 stresstest event [LONG]

Nicholas J. Leon (nicholas@binary9.net)
Fri, 19 Jun 1998 13:06:08 -0400 (EDT)


On Fri, 19 Jun 1998, Michael L. Galbraith wrote:

# After running a couple of make -j bzImage, I decided that a longer
# running make would save me having to restart it so often, so I started
# a make -j in glibc-2.0.94 and continued xfering 1gig chunks to my
# laptop (and restarting when it finished).

I also decided to do some stress testing on a 2.1.106 box last night. I ran
this:

[/sbin/scripts/torture]

#!/bin/sh

tordir=/tmp/torture.$$
count=20

mkdir ${tordir}
cd ${tordir}

cur=0

while [ ${cur} != ${count} ]; do
    mkdir ${cur}
    cd ${cur}

    cp -Rdvs /usr/src/linux/* .

    cd ..

    cur=`expr ${cur} + 1`
done

echo -n "Last chance, CTRL-C now to abort! Enter to continue! "
read

echo -n "Ok... starting ${count} builds..."

cd ${tordir}

for cur in *; do
    echo -n ${cur}
    ( cd ${cur} && make clean oldconfig depend && sleep 10 && \
      /bin/time make -j ) > ${tordir}/${cur}/make.log 2>&1 &
done

echo ''

There were a couple of things I noticed that I want to share:

1) Because of the backgrounding and the "sleep 10", all 20 copies started
the actual build at the same time. During the builds, I noticed that all 20
copies stayed RIGHT IN SYNC with each other, compiling the same files at the
same time (the cache hit rate must have been enormous). It was VERY
impressive to watch.
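The start-together behaviour falls out of the "sleep 10": each backgrounded
subshell finishes its (comparatively quick) setup, then waits long enough for
the slowest setup to finish, so every make -j kicks off at essentially the
same moment. A minimal sketch of the pattern (job count and messages are
illustrative, not from the script above):

```shell
#!/bin/sh
# Each job does its own setup, then sleeps long enough for the slowest
# setup to complete, so all jobs begin the real work together.
for i in 1 2 3; do
    (
        echo "job $i: setup done"
        sleep 2                  # stand-in for the script's 'sleep 10'
        echo "job $i: building"
    ) &
done
wait                             # collect all background jobs
```

The same trick scales to the 20 kernel trees: the per-tree setup
(make clean oldconfig depend) varies in length, but the sleep absorbs
the difference.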

2) I ran out of VM real fast. On 192MB RAM + 128MB swap it died within two
minutes. So I added some swap:

Filename                        Type            Size    Used    Priority
/dev/hda2                       partition       64256   2548    1
/dev/hdc2                       partition       64252   2312    1
/usr2/nicholas/tmp/bigswap0     file            130748  0       -6
/usr2/nicholas/tmp/bigswap1     file            130748  0       -9
/usr2/nicholas/tmp/bigswap2     file            130748  0       -11
/usr2/nicholas/tmp/bigswap3     file            130748  0       -13
/usr2/nicholas/tmp/bigswap4     file            130748  0       -15
/usr2/nicholas/tmp/bigswap5     file            130748  0       -20

For a total memory space of

             total       used       free     shared    buffers     cached
Mem:        192856     190196       2660      16540      19752     140500
-/+ buffers/cache:      29944     162912
Swap:       912996       4860     908136
Total:     1105852     195056     910796

This nicely covered it :)
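For reference, file-backed swap like the bigswap* entries above is set up
with the standard dd/mkswap/swapon sequence; the sizes match the listing,
but the exact commands are my reconstruction, not taken from the post:

```shell
#!/bin/sh
# Sketch: create and enable six ~128MB swap files (needs root).
i=0
while [ ${i} -lt 6 ]; do
    f=/usr2/nicholas/tmp/bigswap${i}
    dd if=/dev/zero of=${f} bs=1024 count=131072  # ~128MB of zeroes
    mkswap ${f}                                   # write the swap signature
    swapon ${f}    # each new area gets a lower priority, hence -6 .. -20
    i=`expr ${i} + 1`
done
swapon -s          # print the table of active swap areas shown above
```

The partitions keep priority 1, so the kernel only touches the (slower)
swap files once the partitions fill up.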

3) It worked. No problems*. Getting a shell was a bit annoying (I had to
wait at least two minutes for my root shell on another vt to swap back in),
but once I was in and had reniced it to -20 it was smooth. I did some
vmstat'ing and saw some pretty amazing things: 92 running procs, 518
blocked, 5000+ context switches per second, 10000+ blocks in/out and swaps
in/out. Load average off the scale (200+).

(* except for a few sk_buff allocation errors, that is)

# To check to see if there was any hidden damage

As soon as it finished, I went single user without rebooting and did a
forced check on the partitions involved. Boom. All those wonderful errors
(mostly harmless, in my case) that we've been reading about here for the
past two days.
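The forced check would go roughly like this (e2fsck for ext2; the runlevel
change and the device name for /usr2 are my assumptions, since the post
doesn't give them):

```shell
# Sketch: force a filesystem check without rebooting (needs root).
telinit 1               # drop to single-user mode
umount /usr2            # the fs must be unmounted (or mounted read-only)
                        # before it can be safely checked
e2fsck -f /dev/hda3     # hypothetical device for /usr2; -f forces a
                        # check even if the fs is marked clean
```

With 2.1.10x's buffer-cache troubles, -f is the important part: the
filesystems were marked clean, so an ordinary boot-time fsck would have
skipped them.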

But still, I brought the box back up to a rational runlevel and I'm still
running it as we speak.

-----------------------------------------------------------------------------
Nicholas J. Leon "Elegance Through Simplicity"
nicholas@binary9.net - - http://mrnick.binary9.net

8 4 9 1 7 3 <-- what is the pattern?
[ p6dbe-2xPII-233-udma ]
