Possible bug in 2.0.0 (And later?)

Solitude (solitude@johnf.reshall.ksu.edu)
Sat, 1 Feb 1997 01:11:20 -0600 (CST)

I am writing partially to report a possible bug, and partially because I
am still in shock over what became of my ever-efficient OS...

I am running Linux kernel 2.0.0 on an Acer VI15G PCI motherboard, with an
Intel Pentium 100 chip. I have 40 megs of physical RAM and (had) 32 megs
of swap. This system was first built in July of '96 and, aside from
various reboots, it has been running non-stop since that time.

I entered the room and the first thing I heard was massive HD activity.
The console had error messages streaming by, effectively making it
unusable. I telnet'ed in and learned that syslog was reporting the
following error over and over:

Jan 31 15:52:38 johnf kernel: p_duplicate: trying to duplicate unused page
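
To get a feel for how badly a message like this is flooding the log, it
can help to count its occurrences. The following is only a rough sketch:
the function name count_kernel_msg is made up for illustration, and the
log path in the usage comment is an assumption (syslog may be configured
to write kernel messages elsewhere).

```shell
# Hypothetical helper: count how many lines of a log file contain a
# fixed string. -F treats the pattern as literal text, not a regex.
count_kernel_msg() {
    logfile=$1
    pattern=$2
    # grep -c exits non-zero when the count is 0; "|| true" keeps the
    # function usable under "set -e" while still printing the 0.
    grep -c -F -- "$pattern" "$logfile" || true
}

# Example call (the path is an assumption, adjust for your syslog.conf):
#   count_kernel_msg /var/log/messages 'trying to duplicate unused page'
```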

I looked through the source and it turns out this message is supposed to
read 'swap_duplicate'. Anyway, back to the story... I immediately turned
off the swap space, then formatted and activated some new swap space on a
different drive I had. The problem seemed to be fixed. I re-booted onto
the rescue bootdisk and ran fsck and badblocks on all my filesystems. I
ran badblocks in write mode on all my swap partitions. Neither fsck nor
badblocks reported any problems. I re-ran mkswap -c on all my swap
partitions and re-booted. That was about 7-8 hours ago, and the system
hasn't had any problems since.
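
For anyone who wants to retrace the steps above, here is a rough sketch of
the sequence. It only prints the commands and does not run them, since
badblocks -w and mkswap destroy whatever is on the partition. The function
name and the device names in the usage comment are placeholders, not the
actual devices on this machine, and the fsck pass over the regular
filesystems is left out.

```shell
# Hypothetical dry-run helper: print the swap recovery steps in order.
# $1 = the suspect swap partition, $2 = a spare partition for new swap.
swap_recovery_plan() {
    old_swap=$1
    new_swap=$2
    echo "swapoff $old_swap"       # stop using the suspect swap area
    echo "mkswap -c $new_swap"     # format new swap, checking for bad blocks
    echo "swapon $new_swap"        # activate the replacement
    # The remaining steps were done after booting the rescue disk:
    echo "badblocks -w $old_swap"  # destructive write-mode surface test
    echo "mkswap -c $old_swap"     # re-create the old swap area
}

# Example call with placeholder device names:
#   swap_recovery_plan /dev/sdX1 /dev/sdY1
```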

Now: I'm no kernel hacker, but here are my own amateur observations:
There is no physical problem with the RAM, motherboard, disk, controller,
etc. (The controller is an AHA2940 and the RAM is Kingston.) If this were
a hardware problem then it would have surfaced much earlier than now.
(Especially if it were a RAM problem, as I regularly pound the hell out
of the RAM with various compiles.) I have also built copies of this
system for several other people with the same components, and no one else
has had any problems.

After reading through the memory-management source code I believe that
the problem I had was due to a corrupted map table in the swap area. I
noticed the following syslog entry that occurred just before I rebooted
the system:

Jan 31 15:59:26 johnf kernel: 000be600)

The only thing about the system that I ever thought was acting kind of
weird was that there seems to be only about a 5 second delay between the
SIGTERM and SIGKILL signals during a shutdown. I have tried using the -t
option to shutdown, but no matter what value I give, the delay stays very
short.
Otherwise, the system has always been solid as a rock, aside from this one
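
To make the SIGTERM/SIGKILL delay concrete: the general pattern is to ask
a process to exit, wait a grace period, and only then force it. The sketch
below shows that pattern for a single process. It is not how shutdown or
init implement it, and the function name term_then_kill is invented for
illustration.

```shell
# Hypothetical helper: send SIGTERM, wait, then SIGKILL if still present.
# $1 = process id, $2 = grace period in seconds (defaults to 5).
term_then_kill() {
    pid=$1
    delay=${2:-5}
    # Polite request first; if the process is already gone, we are done.
    kill -TERM "$pid" 2>/dev/null || return 0
    # Grace period during which the process can clean up and exit.
    sleep "$delay"
    # kill -0 sends no signal, it only checks that the pid still exists.
    if kill -0 "$pid" 2>/dev/null; then
        kill -KILL "$pid" 2>/dev/null
    fi
    return 0
}

# Example call: give process 1234 ten seconds before forcing it.
#   term_then_kill 1234 10
```

A longer grace period gives daemons more time to flush their state before
they are killed outright.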

Well, if anyone has any ideas on this, I would like to regain my previous
confidence in the stability of Linux. If this is a problem that has been
identified and corrected then I will upgrade my kernel

- John

