Re: 2.1.124: ext2fs corruption and kernel panic

Jelle Foks (jelle@flying.demon.nl)
Fri, 16 Oct 1998 19:50:16 +0000


On Wed, 14 Oct 1998, Jakob Borg wrote:

> On Wed, Oct 14, 1998 at 02:42:32PM +0100, Stephen C. Tweedie wrote:
> > Hi,
> >
> > On Mon, 12 Oct 1998 18:38:39 +0000, Jakob Borg <jborg@df.lth.se> said:
> >
> > > About two minutes before this I had begun to notice strange behavior,
> > > Makefiles and config.h in a project I was working on suddenly became full
> > > of random binary crap.
> >
> > Random memory corruption: oh dear.
>
> It wasn't memory corruption; it was corrupted _on_ _disk_: several reads
> from disk returned the same result. In one case the file began with an
> ELF header, so I suspect it might have been the inode/directory entry or
> something like that.
>
> Also, I have no problems whatsoever with my memory otherwise: weeks of
> uptime with a heavily loaded system.
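
For anyone who wants to check other files for the ELF-header symptom
described above, here is a minimal sketch in Python. It only tests
whether a file begins with the four ELF magic bytes (0x7f 'E' 'L' 'F');
the function name starts_with_elf_header is made up for illustration,
and this is not something that was run on the affected systems.

```python
# The first four bytes of every ELF file: 0x7f followed by "ELF".
ELF_MAGIC = b"\x7fELF"


def starts_with_elf_header(path):
    """Return True if the file at `path` begins with the ELF magic bytes.

    Hypothetical helper: a text file such as a Makefile or config.h
    should never return True here, so a True result for such a file
    hints that its contents were replaced by binary data.
    """
    with open(path, "rb") as f:
        return f.read(len(ELF_MAGIC)) == ELF_MAGIC
```

Running it twice, once before and once after a reboot, would at least
show whether the bad contents survive a cold start.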

Maybe my experience helps in finding the problem:

I just had my libX11.so.6 corrupted on 2.1.123 (rh5.1). The file grew by
just a few bytes and random X11 programs would no longer start. I first
got a "libX11.so.6 not found" error while starting an X11 app. After
checking that the file was really there, I guessed that it could be a
problem with the kernel (eh, 'experimental'), so I did a reboot, but the
system crashed while shutting down (all I could see was a lot of
"[<09dc90ec>]"-type numbers on the screen). After rebooting, I had to run
e2fsck manually, and then my X server wouldn't allow me to log in. Checking
~/.xsession-errors showed me an error about libXpm.so.4.9 not being found.
So I compared libX11.so.6 and libXpm.so.4.9 with those on another system
and found libX11.so.6 corrupted. I overwrote the old one (sorry), so I
have no more traces than this.
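
A comparison like this, against a known-good copy from another system,
can be sketched roughly as follows. This is only an illustration in
Python, not the commands that were actually run; the function name
first_difference and the paths in the usage comment are hypothetical.
It reports both file sizes and the offset of the first differing byte,
which is the kind of trace worth saving before overwriting a corrupted
file.

```python
import os


def first_difference(path_a, path_b, chunk_size=65536):
    """Compare two files byte by byte.

    Returns (offset, size_a, size_b). `offset` is the position of the
    first differing byte, or None if the contents are identical. If one
    file is a strict prefix of the other, `offset` is the length of the
    shorter file.
    """
    size_a = os.path.getsize(path_a)
    size_b = os.path.getsize(path_b)
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Find the first differing byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i, size_a, size_b
                # No mismatch in the overlap: one chunk is a prefix.
                return offset + min(len(a), len(b)), size_a, size_b
            if not a:
                # Both files ended at the same point with no mismatch.
                return None, size_a, size_b
            offset += len(a)


# Hypothetical usage (paths are examples only):
# first_difference("/usr/X11R6/lib/libX11.so.6", "/mnt/good/libX11.so.6")
```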

This makes me suspect that it may have something to do with the ld.so
shared library loader... but I'm not much of an expert here.

The strange thing is that shared libs are opened for reading only, so
file corruption should not be possible. Maybe it's in the (triton IDE-dma)
block driver.

> > > After reboot there was no serious corruption, e2fsck mentioned
> > > config.h as having a deleted but not cleared inode or some such, sorry
> > > but my memory fails me. Also a bunch of inodes with zero dtime.
> >
> > OK, so the corruption was in cache, but not on disk. Doesn't sound like
> > an ext2 issue: what drivers are you running?
>
> Not being an fs expert (far from it), I wonder what makes you so sure
> it was in the cache? Config.h was one of the files (Makefile was the
> other; I was editing both) that was 'hit' by the error, and fsck found an
> error in that file's inode.
>
> Concerning my drivers, my .config can be found at
> http://replay.linuxpower.org/config (don't want to flood the list).
>
> If there are any other facts I can give, please ask.
