More bad news on 2.0 stability. Scary stuff

Bernd Schmidt (crux@Pool.Informatik.RWTH-Aachen.DE)
Wed, 2 Apr 1997 10:48:04 +0200 (MET DST)

I was looking at mm/filemap.c for the last few days because I'm working on a
patch to use the page cache for writes. While looking at the code, I noticed
a few potential problems.
I wrote the following test program to verify a bug I suspected:

#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <stdlib.h>
#include <string.h>

/* Define this much larger than your RAM size so you really hit swap */
#define RAMSIZE (30*1024*1024)

int main()
{
    int fd = open ("testfile", O_RDWR|O_CREAT|O_TRUNC, 0644);
    char *map = mmap (0, RAMSIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

    /* Extend the file to RAMSIZE bytes so the whole mapping is backed */
    lseek (fd, RAMSIZE-1, SEEK_SET);
    write (fd, &fd, 1);

    for (;;) {
        int pos = random() % RAMSIZE;
        int pos2 = random() % RAMSIZE;
        int len = random() % RAMSIZE;
        int mode = random() & 1;

        /* Keep both ranges inside the mapping */
        if (pos + len > RAMSIZE)
            len = RAMSIZE-pos;
        if (pos2 + len > RAMSIZE)
            len = RAMSIZE-pos2;

        if (mode)
            /* Dirty a random range through the shared mapping */
            memset (map + pos, random(), len);
        else {
            /* write() to the file, using its own mapping as the source */
            lseek (fd, pos, SEEK_SET);
            write (fd, map + pos2, len);
        }
    }
}

Run this on an 8MB machine (or give your kernel the option "mem=8m") running
kernel 2.0.29. It will lock up the whole machine very quickly.

As I see it, the problem is that filemap.c:filemap_write_page() down()s the
inode semaphore of the inode that owns the page being swapped out. The
swapout code can be called from basically anywhere when the system is low on
memory, including from the fs code. The scary part is that the same inode
semaphore may already be held, and the really scary part is that it may be
held by the same process => the process deadlocks with itself.
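
To make the call chain concrete, here is a small user-space model of it. This
is only a sketch, not kernel code: every toy_* name is made up for
illustration, the semaphore is a plain int, and toy_down() returns -1 at the
point where a real down() would put the caller to sleep.

```c
#include <assert.h>

/* Toy stand-in for the kernel's inode semaphore: 1 = free, 0 = held. */
struct toy_inode {
    int sem;
};

/* Non-blocking model of down(): returns 0 on success, or -1 where a
   real down() would put the caller to sleep. */
static int toy_down(struct toy_inode *inode)
{
    if (inode->sem == 0)
        return -1;
    inode->sem = 0;
    return 0;
}

static void toy_up(struct toy_inode *inode)
{
    assert(inode->sem == 0);    /* only a held semaphore can be released */
    inode->sem = 1;
}

/* Models filemap_write_page(): writing a dirty page back needs the
   semaphore of the inode that owns the page. */
static int toy_write_page(struct toy_inode *page_owner)
{
    if (toy_down(page_owner) != 0)
        return -1;              /* the real code would sleep here */
    toy_up(page_owner);
    return 0;
}

/* Models write(): takes the inode semaphore, then needs memory. When
   memory is low, the allocation swaps out a page, and that page may
   belong to the very inode being written. Returns -1 if the caller
   would deadlock with itself. */
static int toy_file_write(struct toy_inode *target,
                          struct toy_inode *victim_owner, int low_memory)
{
    int result = 0;

    if (toy_down(target) != 0)
        return -1;
    if (low_memory)
        result = toy_write_page(victim_owner);
    toy_up(target);
    return result;
}
```

In this model, toy_file_write(&a, &b, 1) succeeds for two different inodes,
while toy_file_write(&a, &a, 1) reports the self-deadlock: the semaphore is
still held by the caller when the swapout path asks for it again.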

I didn't expect this program to lock up the whole machine, though; I rather
expected that only this process would get stuck. But I also got kswapd and
an agetty stuck in "D" state after running it. I can't quite explain that yet.

I'll try to come up with a fix for this soon. I think the higher-level swapping
code should have a look at the inode if it's trying to swap out a page cache
page (a "trydown" function for inodes could be very useful).
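
A rough sketch of what I mean by "trydown", again as a user-space model
rather than a patch. The sketch_* names and the page/inode structs are
hypothetical, and clearing the dirty flag merely stands in for the real
write-back. The idea is that the swapout scan never sleeps on an inode
semaphore; if the inode is busy, it skips that page and looks for another.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_inode {
    int sem;                        /* 1 = free, 0 = held */
};

struct sketch_page {
    struct sketch_inode *inode;     /* inode that owns this page */
    int dirty;
};

/* Hypothetical "trydown": take the semaphore only if it is free.
   Never sleeps. Returns 1 on success, 0 if the semaphore is held. */
static int sketch_trydown(struct sketch_inode *inode)
{
    if (inode->sem == 0)
        return 0;
    inode->sem = 0;
    return 1;
}

static void sketch_up(struct sketch_inode *inode)
{
    assert(inode->sem == 0);        /* must be held to be released */
    inode->sem = 1;
}

/* Write out the first dirty page whose inode can be locked without
   sleeping. Pages of busy inodes are skipped. Returns the index of
   the page written, or -1 if no page could be written. */
static int sketch_swap_out(struct sketch_page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!pages[i].dirty)
            continue;
        if (!sketch_trydown(pages[i].inode))
            continue;               /* inode busy: try another page */
        pages[i].dirty = 0;         /* stands in for the write-back */
        sketch_up(pages[i].inode);
        return (int)i;
    }
    return -1;
}
```

With a dirty page on a busy inode at index 0 and a dirty page on an idle
inode at index 2, sketch_swap_out() leaves page 0 alone and writes page 2.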