Re: ext3 on raid5 failure

From: Jan Dittmer
Date: Wed Jan 28 2004 - 05:57:08 EST


Theodore Ts'o wrote:
On Fri, Jan 23, 2004 at 09:22:25AM +0100, Jan Dittmer wrote:

Okay, I fscked all filesystems in single-user mode and thereby fscked up my root filesystem, though I didn't even check it, so I restored it from backup (grub wouldn't even load anymore).


What messages were printed by e2fsck while it was running --- and were
all of the filesystems unmounted, except for the root filesystem,
which should have been mounted read-only?

Yep, they were all unmounted. But I don't have a transcript of this session; I was logged in directly, sorry.


After 2 days in my freshly set up Debian (2.6.1-bk6), the same error. But this time at least I know it's because I tried to delete those files in the lost+found directory...


How did you come to that conclusion?

sfhq:/mnt/data/1/lost+found# ls -l
total 76
d-wSr----- 2 1212680233 136929556 49152 Jun 7 2008 #16370
-rwx-wx--- 1 1628702729 135220664 45056 May 4 1974 #16380
sfhq:/mnt/data/1/lost+found# rm *
rm: cannot remove `#16370': Operation not permitted
rm: cannot remove `#16380': Operation not permitted
sfhq:/mnt/data/1/lost+found# rm *
rm: cannot remove `#16370': Operation not permitted
rm: cannot remove `#16380': Operation not permitted
sfhq:/mnt/data/1/lost+found# rm *
rm: cannot remove `#16370': Operation not permitted
rm: cannot remove `#16380': Operation not permitted
sfhq:/mnt/data/1/lost+found# dmesg
EXT3-fs error (device dm-2): ext3_readdir: directory #16370 contains a hole at offset 8192
Aborting journal on device dm-2.
EXT3-fs error (device dm-2): ext3_readdir: directory #16370 contains a hole at offset 24576
ext3_abort called.
EXT3-fs abort (device dm-2): ext3_journal_start: Detected aborted journal
Remounting filesystem read-only

This is with 2.6.1-bk6 and no concurrent access. With 2.4.25-pre7-pac1 this does not happen.
So I rechecked the filesystem (with 2.4), hoping it wouldn't get corrupted and that I would be able to delete the files, but:

sfhq:~# fsck /dev/myraid/lvol1
fsck 1.35-WIP (07-Dec-2003)
e2fsck 1.35-WIP (07-Dec-2003)
/dev/myraid/lvol1 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Directory inode 16370 has an unallocated block #2. Allocate<y>? yes

Directory inode 16370 has an unallocated block #6. Allocate<y>? yes

Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Pass 4: Checking reference counts
Pass 5: Checking group summary information

/dev/myraid/lvol1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/myraid/lvol1: 2341/45514752 files (24.0% non-contiguous), 82194611/91002880 blocks
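
As a cross-check, the two kernel messages and the two e2fsck prompts appear to describe the same holes: dividing the byte offsets reported by ext3_readdir by the 4096-byte block size (from the dumpe2fs output further down) gives blocks 2 and 6 of directory inode 16370. A minimal sketch of that arithmetic follows; the helper names `offset_to_block` and `holes` are hypothetical, and it assumes the block numbers e2fsck prints here are logical blocks within the directory.

```python
import re

BLOCK_SIZE = 4096  # "Block size" as reported by dumpe2fs for this filesystem

# The two ext3_readdir errors copied from the dmesg output in this message.
DMESG_LINES = [
    "EXT3-fs error (device dm-2): ext3_readdir: directory #16370 contains a hole at offset 8192",
    "EXT3-fs error (device dm-2): ext3_readdir: directory #16370 contains a hole at offset 24576",
]

HOLE_RE = re.compile(r"directory #(\d+) contains a hole at offset (\d+)")


def offset_to_block(offset, block_size=BLOCK_SIZE):
    """Convert a byte offset inside a directory into a 0-based block index."""
    return offset // block_size


def holes(lines):
    """Return (inode, block index) pairs for every 'contains a hole' message."""
    found = []
    for line in lines:
        match = HOLE_RE.search(line)
        if match:
            inode = int(match.group(1))
            offset = int(match.group(2))
            found.append((inode, offset_to_block(offset)))
    return found


if __name__ == "__main__":
    for inode, block in holes(DMESG_LINES):
        print(f"directory inode {inode}: hole at block {block}")
```

If that assumption holds, the 2.6 kernel and e2fsck agree on which directory blocks are missing, even though only the 2.6 kernel aborts the journal over them.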

Still no luck deleting these files:

sfhq:/mnt/data/1/lost+found# rm -rf *
rm: cannot remove directory `#16370': Operation not permitted
rm: cannot remove `#16380': Operation not permitted

sfhq:/mnt/data# dumpe2fs /dev/myraid/lvol1
dumpe2fs 1.35-WIP (07-Dec-2003)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: ba3f83cd-04c8-4c4d-82f7-990077aabb73
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal dir_index needs_recovery sparse_super
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 45514752
Block count: 91002880
Reserved block count: 3640115
Free blocks: 8808269
Free inodes: 45512411
First block: 0
Block size: 4096
Fragment size: 4096
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 16384
Inode blocks per group: 512
Filesystem created: Thu Jul 3 13:57:02 2003
Last mount time: Wed Jan 28 11:48:13 2004
Last write time: Wed Jan 28 11:48:13 2004
Mount count: 2
Maximum mount count: 23
Last checked: Wed Jan 28 11:47:06 2004
Check interval: 15552000 (6 months)
Next check after: Mon Jul 26 12:47:06 2004
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 128
Journal inode: 8
Default directory hash: tea
Directory Hash Seed: a7fd0c53-efe9-4316-9fa6-e3206623f4fe
Journal backup: inode blocks

Do you need the complete output? By the way, I have the same error on another partition (also on raid5, dm).
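
For what it's worth, the superblock counters in the dumpe2fs output above can be checked against the e2fsck summary line (2341/45514752 files, 82194611/91002880 blocks). The sketch below only redoes that subtraction on the values pasted in this message; the function names `parse_counters` and `usage` are hypothetical, and nothing here reads the actual device.

```python
# Counter lines copied from the dumpe2fs output in this message.
DUMPE2FS = """\
Inode count: 45514752
Block count: 91002880
Reserved block count: 3640115
Free blocks: 8808269
Free inodes: 45512411
"""


def parse_counters(text):
    """Collect 'Name: <integer>' lines into a dict; other lines are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            fields[key.strip()] = int(value)
    return fields


def usage(fields):
    """Derive used inodes, used blocks and the reserved-block percentage."""
    used_inodes = fields["Inode count"] - fields["Free inodes"]
    used_blocks = fields["Block count"] - fields["Free blocks"]
    reserved_pct = 100.0 * fields["Reserved block count"] / fields["Block count"]
    return used_inodes, used_blocks, reserved_pct


if __name__ == "__main__":
    inodes, blocks, reserved = usage(parse_counters(DUMPE2FS))
    print(f"used inodes: {inodes}")
    print(f"used blocks: {blocks}")
    print(f"reserved:    {reserved:.1f}%")
```

Both derived values match the e2fsck summary, so the free-block and free-inode counts in the superblock at least look consistent after the check.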

Thanks,

Jan
