Re: rmdir hangs on bad ext2 directory (1.2.11)

Scott Johnson (johnsos@ECE.ORST.EDU)
Sun, 16 Jul 1995 15:56:39 -0700

>On Wed, 12 Jul 1995, Marek Michalkiewicz wrote:

>> Just a moment ago I tried to remove a directory which was a file before
>> the filesystem corruption (not detected by e2fsck) caused by bad RAM.
>> The directory disappeared, but rmdir hung, I can't kill it (even -9 doesn't
>> work, I guess I will have to reboot), ps shows it in the R state using over
>> 90% of CPU time. The following message was found in syslog:
>> EXT2-fs warning (device 22/1): empty_dir: bad directory (dir 46195)

>We've had the exact same thing happen here, running a vanilla 1.3.4
>kernel. What I'd take this to mean is that something between the system
>call and the e2fs code is broken, though I'm not expert on such things.

>I'd also take this to mean it hasn't been fixed yet. :-)

>We were running rmdir on a fairly heavily used drive, mounted as
>/var/spool/news. Very annoying problem. Anyone had this happen with rm, BTW?

>> Just an idea: how about CRC checksums for inodes? This would allow easy

>High overhead?
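
On the overhead question, here is a rough sketch of what a per-inode checksum
could look like. The inode layout below is a simplified, hypothetical one (not
the real ext2 on-disk format), and pack_inode / inode_is_valid are names made
up for illustration. The cost would be one pass over the inode's bytes on each
read or write, plus four bytes of storage per inode.

```python
import struct
import zlib

# Hypothetical, simplified inode layout (NOT the real ext2 on-disk format):
# mode (u16), uid (u16), size (u32), mtime (u32), 12 block pointers (u32 each).
INODE_FMT = "<HHII12I"


def pack_inode(mode, uid, size, mtime, blocks):
    """Serialize the illustrative inode fields and append a CRC-32 of them."""
    body = struct.pack(INODE_FMT, mode, uid, size, mtime, *blocks)
    return body + struct.pack("<I", zlib.crc32(body))


def inode_is_valid(raw):
    """Recompute the CRC over the body and compare it with the stored value."""
    body = raw[:-4]
    (stored,) = struct.unpack("<I", raw[-4:])
    return zlib.crc32(body) == stored
```

A check like this would only catch corruption that happens after the checksum
is computed; if bad RAM mangles the inode before that point, the CRC would
simply be computed over the wrong data.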

I've been having a similar problem (rather rarely) on my system as well. I've
got 1.2.10 running, and my entire Linux filesystem is mounted on a single
partition (about 200 Mbytes; I've considered repartitioning to give Linux
more space and OS/2 Warp less, but that's another story). The drive is a WD
540 "Caviar" drive (the device is /dev/hdb5, in case that matters). Every
once in a while, some process will try to access the /var/adm directory, and
for some reason die (enter uninterruptible sleep). When this happens, the
HD makes a strange noise, similar to being powered up for the first time. (My
PC is a desktop, so the HD should not be spinning down for any reason...) It
may be hardware trouble, it may be something Linux is doing, I dunno. At any
rate, ANY process which tries to access this directory (/var/adm) gets put to
sleep. syslogd is usually the first to die, but init soon follows. Any
process which terminates after that, instead of dying gracefully, becomes a
zombie. And shutting down properly with a hung init process is a pain... :) I
end up having to give the computer the One Fingered Salute (shutdown hangs when
trying to kill off these hung processes), and pray when I reboot and run fsck.

As I said, it MIGHT be a HD problem (can anyone recommend a good utility to
analyze the media of the HD non-destructively? BIOS has a media analysis tool,
but it erases everything). However, if the hardware fails, I don't think the
correct way for Linux to respond is to cause processes to hang. Anyone have
any ideas?
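
For what it's worth, the kind of non-destructive check I have in mind is
something that only ever reads: walk the device block by block and note which
reads fail. A minimal sketch of the idea, assuming the device node is readable
by the user running it; scan_readonly and the 4096-byte block size are
illustrative choices, not an existing tool.

```python
import os


def scan_readonly(path, block_size=4096):
    """Read `path` block by block without writing anything.

    Returns the byte offsets of the blocks whose reads failed.
    """
    bad = []
    fd = os.open(path, os.O_RDONLY)      # read-only: nothing can be erased
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            try:
                os.lseek(fd, offset, os.SEEK_SET)
                chunk = os.read(fd, min(block_size, size - offset))
            except OSError:
                bad.append(offset)       # remember the failing block, move on
                offset += block_size
                continue
            if not chunk:                # unexpected end of data
                break
            offset += len(chunk)
    finally:
        os.close(fd)
    return bad
```

Something like scan_readonly("/dev/hdb5") would then list the suspect offsets.
Reads that are satisfied from the buffer cache never touch the platters, so a
clean result from a scan like this would not prove the media is good.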