Re: memory & filesystem corruption under heavy load?

Robert L Krawitz (rlk@tiac.net)
Fri, 5 Apr 1996 10:18:48 -0500


Date: Fri, 05 Apr 1996 01:07:57 -0500
From: garth zenie <gzenie@espresso.hampshire.edu>

i can repeatedly, without fail, achieve filesystem corruption under
1.3.72-82 by unpacking two previously gunzip'd linux kernel source
trees while cat'ing eight copies of /dev/mem into separate files
(probably don't even need to do all of that, it seems to happen by
itself just by unpacking a single source tree). running a diff over the
source trees reveals massive file damage.

What does the corruption look like?
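
To help answer that, here is a minimal sketch (not anything from the original report) that lists which files differ between two unpacked copies of the same tree, roughly what a recursive diff would flag. The function name differing_files and the directory arguments are hypothetical; it assumes both trees were unpacked from the same archive, so any content difference counts as damage.

```python
import filecmp
import os


def differing_files(tree_a, tree_b):
    """Return relative paths of files present in both trees whose bytes differ."""
    damaged = []
    for root, _dirs, files in os.walk(tree_a):
        rel_root = os.path.relpath(root, tree_a)
        for name in files:
            rel = os.path.normpath(os.path.join(rel_root, name))
            path_a = os.path.join(tree_a, rel)
            path_b = os.path.join(tree_b, rel)
            # shallow=False forces a byte-by-byte comparison instead of
            # trusting size and modification time.
            if os.path.isfile(path_b) and not filecmp.cmp(path_a, path_b, shallow=False):
                damaged.append(rel)
    return sorted(damaged)
```

Calling differing_files("linux-a", "linux-b") (directory names made up here) gives a list of damaged files that can then be examined byte by byte. Files missing from the second tree are skipped rather than reported.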

One possible data point: when I use my fast memcpy routine I get more
corruption (this corruption does NOT look at all like the corruption I
got with buggy versions of memcpy -- that gave me blocks of nulls
that were close to page boundaries; the current corruption is random
bits flipped here and there with no discernible alignment or other
pattern). I still have similar problems -- just less extensive --
without it. It's a somewhat rough data point because of the memcpy
routine, but it may be a data point nonetheless.
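
The two corruption patterns described above (zeroed blocks near page boundaries versus scattered bit flips) can be told apart mechanically. The following is only a sketch under stated assumptions: a known-good and a damaged copy of the same data are available as equal-length byte strings, the page size is 4096 bytes (the i386 value), and the name classify_corruption is made up for illustration. An offset within the file only hints at page alignment; it does not prove where the data sat in memory.

```python
PAGE_SIZE = 4096  # assumption: i386 page size


def classify_corruption(good, bad, page_size=PAGE_SIZE):
    """Describe each run of differing bytes between two equal-length byte strings."""
    if len(good) != len(bad):
        raise ValueError("inputs must be the same length")
    runs = []
    i = 0
    n = len(good)
    while i < n:
        if good[i] == bad[i]:
            i += 1
            continue
        start = i
        bits = 0
        all_zero = True
        # Extend the run while bytes keep differing. A zeroed block that
        # covers bytes which were already zero will show up as several runs.
        while i < n and good[i] != bad[i]:
            bits += bin(good[i] ^ bad[i]).count("1")
            all_zero = all_zero and bad[i] == 0
            i += 1
        runs.append({
            "offset": start,
            "length": i - start,
            "page_offset": start % page_size,  # position within its page
            "zeroed": all_zero,                # True if every damaged byte is 0
            "bits_flipped": bits,              # differing bits across the run
        })
    return runs
```

Long runs with "zeroed" set and a "page_offset" near 0 or near the page size would match the old memcpy-bug pattern; short runs with only one or two bits flipped and no regularity in "page_offset" would match the random damage.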

Also, I find that some days I get much more extensive corruption than
others (one day I had three files with corruption, another day I had a
few dozen). The difference seems to be more than could be accounted
for strictly by chance, and if I repeat the test immediately I have
the same thing happen. I think that I have more problems on very dry
days than on days with higher humidity, but I'm not positive. This
would tend to indicate a hardware problem, but it's interesting that
other people are seeing similar symptoms.

Is there anything in common here? Does everyone who's seeing problems
have, say, a Pentium, or do people with 486 and other systems see this
also? Are these older Pentiums (like mine -- my system is almost 2
years old)? Perhaps there's some sort of latent hardware problem that
only manifests itself under heavy load, and perhaps the improved
performance of more recent kernels triggers this somehow.

-- 
Robert Krawitz <rlk@tiac.net>           http://www.tiac.net/users/rlk/

Member of the League for Programming Freedom -- mail lpf@uunet.uu.net
Tall Clubs International -- tci-request@aptinc.com or 1-800-521-2512