Re: ext2 filesystem corruption?!?!?? (fwd)

Keith Rohrer (
Sat, 5 Apr 1997 14:18:09 +0000 (GMT)

> > | A return of 0 proves it's neither hardware nor ext2, but a failure does not
> > | indicate anything.
> >
> > I think a return of 0 proves that the hardware/driver/cabling/etc works
> > okay. badblocks doesn't go through the ext2fs filesystem, though, so it
> > doesn't prove that the filesystem driver is bug-free. If you get errors,
> > on the other hand, it can't be from ext2fs, so it must be one of the
> > hardware/driver/cabling/etc things involved.
> Correct. And it's very useful information to have at that. If you can
> produce corruption problems without going through the ext2fs code, then you
> have hardware corruption of some sort. An example of some of the things in
> the past that I have personally seen cause hardware corruption which made one
> *THINK* that something was wrong with the ext2fs code when there wasn't:

[bad hardware or chipset configuration examples snipped]

> And of course, the very reason I posted my original email as part of this
> thread. A person needs to always keep in mind that if they are getting ext2fs
> errors about corruption, this does *NOT* always mean the ext2fs is at fault.
[snip]
> It is important in these cases to try and isolate software faults from
> hardware faults.
> Does anyone else here think that maybe this thread ought to be saved and
> turned into an ext2fs_corruption FAQ in the linux documentation? It seems
> like every so often this thread pops up with similar results. Maybe one line
> of code gets changed here or there (sometimes), but usually, the person in
> question has some hardware problems causing the grief.
I think that there ought to be one big "hardware problems causing software
errors" FAQ, or even a broader "error diagnosis" FAQ. After all, if we
didn't presume GCC was compiled correctly, those "signal 11" errors
wouldn't "automatically" indicate bad hardware either. The only difference
between an invalid instruction exception killing a compiler and a filesystem
getting screwed up by bad hardware is what piece of software/data was
screwed up. Error message interpretation seems to be an acquired art, and
it's time it became a taught skill...
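
For anyone who wants the badblocks reasoning quoted above in concrete form,
here is a minimal sketch in Python. It is not how badblocks itself is
implemented. The names pattern_test, interpret and FlakyDevice, the pattern
bytes, and the block size are all made up for illustration. It writes known
patterns straight to a file-like object and reads them back, so no filesystem
code is involved in the check. Note that on a real device you would also need
to get around the kernel's caching for the read-back to mean anything, which
this sketch does not attempt.

```python
import io

# Illustrative test patterns (an assumption, chosen to toggle every bit).
PATTERNS = (0xAA, 0x55, 0xFF, 0x00)


def pattern_test(dev, block_size=1024, blocks=64):
    """Write each pattern to every block of dev, read it back, and
    return a sorted list of block numbers that failed to verify."""
    bad = set()
    for byte in PATTERNS:
        expected = bytes([byte]) * block_size
        dev.seek(0)
        for _ in range(blocks):
            dev.write(expected)
        dev.flush()
        dev.seek(0)
        for n in range(blocks):
            if dev.read(block_size) != expected:
                bad.add(n)
    return sorted(bad)


def interpret(bad_blocks):
    """Turn the result into the conclusion drawn in the thread."""
    if bad_blocks:
        # The test never touched ext2fs, so ext2fs cannot be to blame.
        return "hardware/driver/cabling suspect; ext2fs code was not involved"
    # A clean pass clears the raw I/O path only.
    return "raw I/O path looks fine; this says nothing about ext2fs code"


class FlakyDevice(io.BytesIO):
    """Hypothetical stand-in for bad hardware: flips one bit in block 3."""

    def read(self, size=-1):
        pos = self.tell()
        data = super().read(size)
        if data and pos == 3 * 1024:
            data = bytes([data[0] ^ 0x01]) + data[1:]
        return data


if __name__ == "__main__":
    for dev in (io.BytesIO(), FlakyDevice()):
        bad = pattern_test(dev)
        print(type(dev).__name__, bad, "->", interpret(bad))
```

The point of the two branches in interpret is the asymmetry discussed above:
a failure pins the blame below the filesystem, while a pass only rules out
that layer.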