ext2 fs corruption in 2.1.131+DAC960-2.1-beta3+knfsd-981122

David Mansfield (david@cobite.com)
Fri, 4 Dec 1998 12:54:17 -0500 (EST)


This is not a success story :-(

I am running the above kernel, compiled UP with gcc 2.7.2.3, on a PII 450
with 256MB RAM. The system is basically RedHat 5.1 (actually it's a
VAResearch box). I have been trying to determine how stable the current
development kernel is as an NFS server. I have managed to get these errors:

EXT2-fs error (device 30:04): ext2_check_blocks_bitmap:
Wrong free blocks count for group 83, stored = 1551, counted = 1036
EXT2-fs error (device 30:04): ext2_new_block:
Free blocks count corrupted for block group 83
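
To make the numbers in these messages easier to read, here is a minimal
Python sketch that pulls them apart. It assumes the two fields in
"device 30:04" are hexadecimal major:minor numbers (an assumption about how
this kernel prints device names), and that "stored" is the free-block count
in the group descriptor while "counted" is what the block bitmap shows. The
function names are made up for illustration.

```python
import re

# Patterns for the two kinds of fields in the EXT2-fs messages quoted above.
DEVICE_RE = re.compile(r"\(device ([0-9a-fA-F]+):([0-9a-fA-F]+)\)")
GROUP_RE = re.compile(
    r"Wrong free blocks count for group (\d+), stored = (\d+), counted = (\d+)"
)

def parse_device(line):
    """Return (major, minor), assuming both fields are hexadecimal."""
    m = DEVICE_RE.search(line)
    if m is None:
        return None
    return int(m.group(1), 16), int(m.group(2), 16)

def parse_group_mismatch(line):
    """Return (group, stored, counted, stored - counted) or None."""
    m = GROUP_RE.search(line)
    if m is None:
        return None
    group, stored, counted = (int(x) for x in m.groups())
    return group, stored, counted, stored - counted

header = "EXT2-fs error (device 30:04): ext2_check_blocks_bitmap:"
detail = "Wrong free blocks count for group 83, stored = 1551, counted = 1036"

print(parse_device(header))          # (48, 4)
print(parse_group_mismatch(detail))  # (83, 1551, 1036, 515)
```

Under those assumptions the device is major 48, minor 4, and the descriptor
for group 83 claims 515 more free blocks than the bitmap accounts for.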

The fs in question is on a RAID-5 logical volume on the DAC960PG. It is
about 20GB and basically empty. Nothing has touched this fs (since I got
the machine) except bonnie run locally and bonnie run from a remote NFS
client.

See also a previous post about 'knfsd strangeness' from yesterday. Here
are the things I have been doing (not all simultaneous, but some :-):

bonnie -s 1000m on the ext2 fs in question
bonnie -s 500m x5 simultaneous
Mount fs from linux 2.0.35 client and bonnie -s 100m
Mount fs from solaris 2.5 client and bonnie -s 100m
Mount fs from solaris 2.6 client and bonnie -s 100m
Mount fs using NFS from local machine (loopback NFS?) and bonnie -s 500m
make -j 100 zImage (not on this fs but simultaneous to bonnie x5)

Other errors in syslog include:

<cut>
nfs: server spike not responding, still trying
nfs: server spike not responding, still trying
nfs: task 1078 can't get a request slot
nfs: task 1079 can't get a request slot
nfs: server spike OK
</cut>

The same as above, repeated with lots of different task numbers.

<cut>
mountd[624]: authenticated mount request from marvin.cobite.com:884
kernel: nfsd: connect from unprivileged port: cf8e8816:36381<4>nfsd:
accept failed (err 11)!
kernel: nfsd: accept failed (err 11)!
kernel: svc: unknown program 100227 (me 100003)
</cut>

(marvin is solaris 2.6)
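
The nfsd lines above carry a few encoded values. The following Python
sketch decodes them, with these assumptions stated up front: the
"cf8e8816" field is taken to be the client's IPv4 address printed as hex
with the most significant byte first, and "err 11" is taken to be a Linux
errno value (11 is EAGAIN on Linux). Program 100003 is the registered RPC
number for NFS and 100227 is the one for the NFS ACL side protocol. The
helper names are hypothetical.

```python
import errno
import socket
import struct

# Registered ONC RPC program numbers that appear in, or relate to, the log.
RPC_PROGRAMS = {100003: "nfs", 100005: "mountd", 100227: "nfs_acl"}

def decode_peer(field):
    """Split 'hexaddr:port' into (dotted-quad address, port).

    Assumes the address is hex with the most significant byte first.
    """
    hex_addr, port = field.split(":")
    packed = struct.pack("!I", int(hex_addr, 16))
    return socket.inet_ntoa(packed), int(port)

def describe_port(port):
    """Ports below 1024 are the traditional 'privileged' range."""
    return "privileged" if port < 1024 else "unprivileged"

addr, port = decode_peer("cf8e8816:36381")
print(addr, port, describe_port(port))
print("mount request port 884 is", describe_port(884))
print("program 100227 is", RPC_PROGRAMS.get(100227, "unknown"))
# The symbolic name for errno 11 depends on the platform running this.
print("errno 11 here is", errno.errorcode.get(11, "unknown"))
```

If the byte-order assumption holds, the rejected connection came from
207.142.136.22 on a port outside the privileged range, and the "unknown
program" line is a client asking the NFS service for the ACL protocol.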

After unmounting, fsck went through cleanly until 'fix summary info'. It
then printed IMMENSE numbers of messages like
'-754395 -754396 -754397 -754398 -754399' (tens of thousands of them),
followed by output like:

Free blocks count wrong for group 115 (0, counted=7854). FIXED
Free blocks count wrong for group 116 (0, counted=7854). FIXED
Free blocks count wrong for group 117 (0, counted=7854). FIXED
Free blocks count wrong for group 118 (0, counted=7854). FIXED

Free blocks count wrong (19069902, counted=19562827). FIXED
Inode bitmap differences: -30723. FIXED
Free inodes count wrong for group #15 (2045, counted=2046). FIXED
Free inodes count wrong (5119769, counted=5119770). FIXED
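
For anyone wanting to size the damage from output like this, here is a
small Python sketch that parses the "count wrong" lines and computes how
far the stored totals were off. It only handles the line shapes quoted
above; the regular expression and function name are illustrative, not
part of any fsck tooling.

```python
import re

# Matches both per-group and filesystem-wide "count wrong" lines.
FSCK_RE = re.compile(
    r"Free (blocks|inodes) count wrong(?: for group #?(\d+))? "
    r"\((\d+), counted=(\d+)\)"
)

def parse_fsck(lines):
    """Return a list of (kind, group_or_None, stored, counted) tuples."""
    records = []
    for line in lines:
        m = FSCK_RE.search(line)
        if m:
            kind, group, stored, counted = m.groups()
            records.append(
                (kind, int(group) if group else None, int(stored), int(counted))
            )
    return records

report = [
    "Free blocks count wrong for group 115 (0, counted=7854). FIXED",
    "Free blocks count wrong (19069902, counted=19562827). FIXED",
    "Inode bitmap differences: -30723. FIXED",
    "Free inodes count wrong for group #15 (2045, counted=2046). FIXED",
    "Free inodes count wrong (5119769, counted=5119770). FIXED",
]

records = parse_fsck(report)
# Filesystem-wide lines have no group number; report counted - stored.
totals = {kind: counted - stored
          for kind, group, stored, counted in records if group is None}
print(totals)  # {'blocks': 492925, 'inodes': 1}
```

So the superblock understated the free space by 492925 blocks and the free
inode count by 1.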

Thanks for listening.
David

-- 
/==============================\
| David Mansfield              |
| david@cobite.com             |
\==============================/
