Re: rmdir hangs on bad ext2 directory (1.2.11)

Remy CARD (card@excalibur.ibp.fr)
Wed, 12 Jul 1995 20:20:34 +0200 (MET DST)


>
> Just a moment ago I tried to remove a directory which was a file before
> the filesystem corruption (not detected by e2fsck) caused by bad RAM.
>
> The directory disappeared, but rmdir hung, I can't kill it (even -9 doesn't
> work, I guess I will have to reboot), ps shows it in the R state using over
> 90% of CPU time. The following message was found in syslog:
>
> EXT2-fs warning (device 22/1): empty_dir: bad directory (dir 46195)
>
> Maybe this information will help tracking down the problem further. This
> is on the same filesystem which previously caused panic. Now the error
> is detected but seems like it is not handled correctly (the rmdir syscall
> should not hang, but return with some error code instead).

Can you please tell me the version of the kernel that you are
using? Recent versions of the kernel (e.g. Linux 1.3.9) should print
a more precise error message than just ``bad directory''.

> I really hope to see a new e2fsck able to fix this soon - I don't know
> enough about filesystem structure to fix it by hand. Thanks and please
> keep up the good work.

If you know the inode number of the bad directory, you can
remove it by using debugfs. Under debugfs, type ``clri <inode-number>''
and the whole inode will be zeroed. After this, unmount the filesystem
and run e2fsck on it. It should complain about blocks that are marked as
allocated but no longer in use (this is normal), and it should fix the
problem.

I think that the next version of e2fsprogs will contain a
fixed e2fsck (I know how to fix this problem, I just need to find some
free time, and I have planned to party next week-end :-).

> Just an idea: how about CRC checksums for inodes? This would allow easy
> detection of such problems, and the hardware Linux is running on is not
> always the highest quality... I can only dream about RAM with ECC, the
> motherboard in this box doesn't even seem to support parity. Single bit
> in an inode can make a big difference, and inodes are updated very often
> compared to files.

Well, this may be a bad thing for performance. Every time the kernel
reads an inode, it would have to verify the checksum, and the checksum
would have to be recomputed every time a field is changed in the inode.
I am afraid that this would slow things down a lot. Anyway, if your
hardware causes data corruption, there is no reason for this corruption
to be limited to the inode table. It can be anywhere in the data blocks,
and you can lose your data too.
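
To make the cost concrete, here is a rough sketch of what per-inode
checksumming would involve. Everything in it is an assumption made for
illustration: ``struct toy_inode'' is a made-up, simplified structure
(not the real ext2 on-disk inode), the function names are hypothetical,
and a plain bitwise CRC-32 stands in for whatever checksum would really
be chosen. It is not code from the kernel or from e2fsprogs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified inode. NOT the real ext2 on-disk layout. */
struct toy_inode {
    uint16_t mode;
    uint16_t links_count;
    uint32_t size;
    uint32_t block[15];
    uint32_t checksum;      /* CRC-32 of every field above this one */
};

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), no lookup table. */
uint32_t crc32_buf(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    uint32_t crc = 0xFFFFFFFFu;

    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Checksum of all fields that precede the checksum field itself. */
uint32_t inode_crc(const struct toy_inode *ino)
{
    assert(ino != NULL);
    return crc32_buf(ino, offsetof(struct toy_inode, checksum));
}

/* Would have to run after every change to any field of the inode. */
void inode_seal(struct toy_inode *ino)
{
    ino->checksum = inode_crc(ino);
}

/* Would have to run on every inode read; returns 1 if intact, 0 if not. */
int inode_verify(const struct toy_inode *ino)
{
    return ino->checksum == inode_crc(ino);
}
```

With something like this, a single flipped bit in ``size'' after
inode_seal() makes inode_verify() return 0, which is the detection you
are asking for. The price is one pass over the inode on every read and
on every update, and the data blocks are still not protected at all.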

>
> Regards,
> Marek
>

Remy