Bug in filesystem code

Bernd Schmidt (crux@Pool.Informatik.RWTH-Aachen.DE)
Mon, 10 Mar 1997 14:53:12 +0100 (MET)


There were a few reports on this list last week that some
filesystem-stressing shell scripts can cause Oopses or other trouble. I cannot
reproduce an Oops, but I get another problem which convinces me there is a bug
in the kernel code. I can reproduce it on two different machines, both of
which are otherwise completely stable:
1. i486-linux, running kernel 2.0.25 or 2.0.29, IDE drive
2. i586-linux, running kernel 2.0.29, SCSI drive on aic7xxx controller.

Do the following: untar cvs-1.9 and compile it (anything else will probably do;
this is just to get a reasonably full directory tree), then create an empty
directory and run the following shell script:

while true; do
    cp -rvf cvs-1.9 foo &
    sleep 25
done

The sleep must be carefully timed for your system so that it finishes before
the cp does. This means that the load will gradually go up the longer this
script runs. If this doesn't work for you, try using a different directory
tree than cvs-1.9 and/or different sleep times - it's kind of sensitive.
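
Because the timing needs tuning per machine, a parameterized variant of the loop may be easier to experiment with. The following is only a sketch: the function name stress_copy and its ROUNDS limit are hypothetical additions, not part of the original script, which loops forever. It starts one background cp per round and never waits for the copies, so the load builds up in the same way.

```shell
#!/bin/sh
# stress_copy SRC DEST DELAY ROUNDS
# Hypothetical helper (not from the original post): starts ROUNDS
# background copies of SRC into DEST, sleeping DELAY seconds between
# starts. It does not wait for the copies, so they overlap whenever
# DELAY is shorter than the time one copy takes.
stress_copy() {
    src=$1
    dest=$2
    delay=$3
    rounds=$4
    i=0
    while [ "$i" -lt "$rounds" ]; do
        cp -rvf "$src" "$dest" &
        sleep "$delay"
        i=$((i + 1))
    done
}

# Example invocation, using the values from the script above and an
# arbitrary round limit:
#   stress_copy cvs-1.9 foo 25 20
```

Lowering DELAY or choosing a larger source tree should make the copies overlap sooner; which values trigger the problem on a given system is not something this sketch can predict.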

After a while (for me, at the point where about 10 cp processes were running
concurrently), I got the following kernel message:

/var/log/syslog:Mar 6 20:05:46 willwink kernel: EXT2-fs error (device 03:07): ext2_read_inode: bad inode number: 0
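
To check whether a run has produced this message, the log can be searched for it. This is a minimal sketch, assuming the kernel messages end up in /var/log/syslog as on the machine above; the function name ext2_error_count is hypothetical, and the log location differs between systems.

```shell
#!/bin/sh
# ext2_error_count [LOGFILE]
# Hypothetical helper: prints how many lines of LOGFILE (default
# /var/log/syslog, an assumption taken from the message above) contain
# the ext2_read_inode "bad inode number" error.
ext2_error_count() {
    grep -c 'EXT2-fs error.*ext2_read_inode: bad inode number' "${1:-/var/log/syslog}"
}

# Example invocation:
#   ext2_error_count /var/log/syslog
```

Note that grep -c prints 0 and returns a non-zero exit status when nothing matches, so the printed count is more convenient to test than the exit status.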

I then added a null pointer dereference to ext2_read_inode to get a stack
trace and repeated the process.

>>EIP: 14e7e2 <ext2_read_inode+42/310>
Trace: 122283 <__iget+123/1e0>
Trace: 150251 <ext2_unlink+71/220>
Trace: 12ac87 <do_unlink+d7/e0>
Trace: 12acb6 <sys_unlink+26/40>
Trace: 10a6e9 <system_call+55/7c>

I hope this helps in tracking it down.

Bernd
