Re: CVS/SSH with ext2fs causes serious fs problems

Scott Smyth (smyth@bashful.realminfo.com)
Wed, 14 Oct 1998 10:48:49 -0400 (EDT)


Yes. I knew that more information was needed, but I wanted to
see if there was a response first. More complete answers to
each question are below:

On Wed, 14 Oct 1998 tytso@mit.edu wrote:

> Date: Mon, 12 Oct 1998 11:00:08 +0000 (/etc/localtime)
> From: Scott Smyth <smyth@bashful.realminfo.com>
>
> When cvs using ssh crashes, there are some weird character and
> block device files created that I cannot alter. For example, in
> the CVS directory:
>
> br--r-srwT 1 29282 25972 105, 103 May 20 2031 Entries
>
> What do you mean by "crashes"? Did cvs core dump? Did you get a kernel
> panic? Did the machine lock up?
>

For instance, if I "^C" out of a cvs commit connected via ssh
from a FreeBSD-3.0 box to the Linux cvs server, the result is
one of the /tmp/cvs-serv### directories containing block or
character device files with the bizarre permissions described
above. The same files also appear when the /tmp directory was
already near capacity, so that cvs could not finish the
operation requested by the client.
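
For anyone who wants to check their own /tmp for the same kind of
entries, here is a minimal sketch, assuming Python 3 on a Unix-like
system. The function names are hypothetical and are not part of cvs or
e2fsprogs. It only reads metadata with lstat(), so it does not try to
open or modify the suspect files. The octal value in the comment is my
own decoding of the "br--r-srwT" string quoted above.

```python
import os
import stat


def describe_mode(mode):
    """Return the ls-style mode string for a raw st_mode value."""
    return stat.filemode(mode)


def find_device_nodes(root):
    """Return (path, mode string, major, minor) for every block or
    character device node found under root, without following symlinks."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                # Entry vanished or cannot be examined; skip it.
                continue
            if stat.S_ISBLK(st.st_mode) or stat.S_ISCHR(st.st_mode):
                found.append((path,
                              stat.filemode(st.st_mode),
                              os.major(st.st_rdev),
                              os.minor(st.st_rdev)))
    return found


if __name__ == "__main__":
    # 0o063456 is a block device with setgid and sticky bits set,
    # which renders as the mode string seen on the bogus "Entries" file.
    print(describe_mode(0o063456))
    for entry in find_device_nodes("/tmp"):
        print(entry)
```

Device nodes have no business living under /tmp/cvs-serv###, so any hit
from a scan like this is worth comparing against what fsck reports.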

What it did not do: cvs did not core dump; there was no kernel
panic; and the machine did not lock up. Everything carried on
normally except the X window from which I ran the cvs command on
the client. That X window locked up, but quite a few programs
cause that when the connection fails, so it is not surprising to
me.

> This kind of corruption usually means that garbage overwrote part of
> your inode table, which is a pretty bad situation. It usually means a
> kernel bug, or a hardware problem of some kind.
>
> This happened without my paying attention and the root fs needs
> to be cleaned up because now cvs cannot do anything because the
> tmp directory is full. "rm", "chmod", etc... are totally
> useless on these files. fsck did detect problems and I had the
> "autofix (-y)" option on with the fsck run, but although it
> completed happily, did not alter anything so I could clean the
> filesystem.
>
> The latest version of e2fsck (version 1.12) will clear these garbage
> files, but the more important question is why are you seeing these
> corrupted files in the first place? This is indicative of something
> very seriously wrong.
>

I will get the latest version of e2fsck. Thanks for the info.
I am using version 1.10-4 (from rpm) as of this moment. As to
why it happened, I am not sure. I have never seen this before
except when using cvs with ssh after cvs fails for some reason.
If cvs completes its function, there are no bogus /tmp
directories hanging around (I assume cvs cleans them up).

> Can you send more information about exactly what sort of "crashes" you
> are seeing, what version of the kernel you have, what kind of hardware
> you have, etc. That's the sort of information we'd need in order to
> debug this.
>

Yes, I can. I run either 2.0.35 or 2.1.124, but only 2.0.35
was running when the corruption occurred. I have all kinds of
patches in the kernel; the filesystem-related ones are:

ntfs patch (not using);
an overlay filesystem patch (not loaded at the time);
hfs (not using);
tcfs (using, but not on that partition); and
the alpha raid patch (present some of the time, but not always).

I was seeing the corrupted files on my root partition (I have
since put /tmp on its own partition of the main hard disk). The
hard disk the corruption occurred on is a Western Digital:

hda: WDC AC32500H, 2441MB w/128kB Cache, CHS=620/128/63, DMA;

as this kernel message indicates. I am using a mix of SDRAM and
72-pin SIMMs, but I have seen no other memory problems with this
mix of the kind that show up on some other motherboards.

Again, I have only seen this problem recently, when using cvs
with ssh. I am going to try to make it happen again now that /tmp
is on its own partition. However, there are no kernel messages or
anything similar that would provide further information.
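
When trying to reproduce this, it may help to record how full /tmp is
and which cvs-serv directories are left behind after each attempt. The
following is only a rough sketch, assuming Python 3; the helper names
are hypothetical, and the "cvs-serv*" pattern is taken from the
directory names mentioned earlier.

```python
import glob
import os
import shutil


def fraction_used(path):
    """Return the fraction (0.0 to 1.0) of the filesystem holding
    path that is currently in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo filesystems report a zero size.
        return 0.0
    return usage.used / usage.total


def leftover_cvs_dirs(tmp_dir):
    """Return the sorted cvs-serv* directories remaining in tmp_dir."""
    pattern = os.path.join(tmp_dir, "cvs-serv*")
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))


if __name__ == "__main__":
    print("fraction of /tmp in use: %.2f" % fraction_used("/tmp"))
    for path in leftover_cvs_dirs("/tmp"):
        print("leftover:", path)
```

Running it before and after an interrupted commit would show whether
the near-full condition and the leftover directories coincide.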

thanks,
Scott

-- 
Scott Smyth, Senior Developer R&D
(770) 446-1332
ssmyth@realminfo.com
