ext2 filesystem corruption?!?!??

Jeff Garzik (jeff.garzik@spinne.com)
Sun, 30 Mar 1997 12:37:24 -0500


This problem occurs on both 2.0.25 and 2.0.29, and is getting really
annoying.

I am running a Usenet news server, and it has several ext2 filesystems
spread across several SCSI disks and controllers. When the load rises to
the point that one of the news processes is always in disk wait, the
on-disk info starts getting corrupted. The free list is the first to go,
followed by general inode chaos. e2fsck often has to restart because it
finds so many errors.

I also had this error crop up every few hours on my md RAID0 partition,
but that went away when I added more disks and controllers to the
stripe.

Has anyone found this bug in ext2? I know some others are experiencing
it. This is the most serious Linux bug I've ever encountered in the two
years I've been running it.

Any help or patch suggestions are *greatly* appreciated. Right now I'm
running 2.0.25 since the problem seems less severe there than on 2.0.29. Can I
back out to an earlier, more stable Linux 2.0.x version? Is there a
patch floating around that fixes this?

Jeff