NFS lockups+empty directories

From: Jaco Kroon
Date: Thu Jun 24 2004 - 04:55:12 EST


Hello there

I'm having 2 problems with nfs atm, I'll first describe the problem of the mysterious deadlocks.

On the client that has deadlocked I see these (tcpdump host servername -n)
11:37:31.272353 IP 137.215.40.16.3750658894 > 137.215.98.25.2049: 140 read [|nfs]
11:37:32.671933 IP 137.215.40.16.3717104462 > 137.215.98.25.2049: 140 read [|nfs]
11:37:33.371732 IP 137.215.40.16.3733881678 > 137.215.98.25.2049: 140 read [|nfs]

On the server I get lots and lots of these (tcpdump host clientname -n)
11:39:23.239369 IP 137.215.40.16.3733881678 > 137.215.98.25.2049: 140 read [|nfs]
11:37:03.280950 IP 137.215.98.25.2049 > 137.215.40.16.3717104462: reply ok 1472 read [|nfs]
11:37:03.280965 IP 137.215.98.25 > 137.215.40.16: udp
11:37:03.280975 IP 137.215.98.25 > 137.215.40.16: udp
11:37:03.280986 IP 137.215.98.25 > 137.215.40.16: udp
11:37:03.280997 IP 137.215.98.25 > 137.215.40.16: udp
11:37:03.281008 IP 137.215.98.25 > 137.215.40.16: udp

The port numbers are *obviously* wrong, but I suspect this is a tcpdump problem. What worries me is that it doesn't look like the packets from the server reaches the client. Now unless there is someone playing around on the routers (which might well be the case) or the routers is just messed up, then this means that there is a problem in the networking code - unlikely since all other services are still functioning, or I'm doing something wrong. My fstab line looks like:

phugeet.cs.up.ac.za:/home/ /home/phugeet nfs defaults,sync 0 0

Whilst my exports file contains:

/home 137.215.40.19(no_root_squash,rw,sync)

My question then is this - why can't tcpdump get the port numbers on those latter packets? Then also, shouldn't the source port numbers for these also be 2049? Once this starts happening I can't fix anything without rebooting - or I haven't figured out how to yet. Also, I cannot umount the mount, not even with -f (k, correction, at least doing a umount -f causes all "deadlocked" processes to be shot down, after which a second umount -f actually umounts the mount).

Right, now for the disappearing directory entries:

(client):
oasis home # mount
/dev/rd/host0/target0/part1 on / type ext3 (rw,noatime)
none on /dev type devfs (rw)
none on /proc type proc (rw)
none on /sys type sysfs (rw)
none on /dev/pts type devpts (rw)
/dev/rd/host0/target0/part3 on /usr type reiserfs (rw,noatime)
/dev/rd/host0/target0/part5 on /tmp type reiserfs (rw,noatime)
/dev/rd/host0/target0/part6 on /var type reiserfs (rw,noatime)
/dev/rd/host0/target0/part7 on /home type reiserfs (rw,noatime)
none on /dev/shm type tmpfs (rw)
phugeet.cs.up.ac.za:/home/ on /home/phugeet type nfs (rw,sync,addr=137.215.98.25)
oasis home # ll /home/phugeet/courses
total 0

(server):
phugeet home # mount
/dev/hda2 on / type ext3 (rw,noatime)
none on /proc type proc (rw)
none on /sys type sysfs (rw)
none on /dev/pts type devpts (rw)
none on /dev/shm type tmpfs (rw)
none on /proc/bus/usb type usbfs (rw)
phugeet home # ls /home/courses
ACE101 CBD780 COS151 COS226 COS332 DCP780 HONS2003 KVM780 RNW780 Studymaterial index.html
ACM CEN101 COS160 COS283 COS333 EPE121 HONS2004 MIT853 RNW781 UPUV lib
AIP780 COS110 COS213 COS284 COS341 EPE210 INY272 MScPhD SEC780 VRS780 post
ALGON COS130 COS214 COS301 COS343 GPG780 INY326 PGT780 SEC781 WCJ results.php
Announcements COS130-EPE111 COS221 COS314 COS344 GRF780 IRA310 PIN780 SWA780-ERA780 ZZZ111 temp
BTI720 COS130SS COS222-ERB210 COS324 COS444 HONS2002 KMI780 PIN781 SWC780-ERD732 change_filenames.txt visitors.php

The options are the same as above.

Any help, or any other pointers will be much appreciated thanks.

Jaco

--
#ifdef STUPIDLY_TRUST_BROKEN_PCMD_ENA_BIT
2.4.0-test2 /usr/src/linux/drivers/ide/cmd640.c
===========================================
This message and attachments are subject to a disclaimer. Please refer to www.it.up.ac.za/documentation/governance/disclaimer/ for full details.
Hierdie boodskap en aanhangsels is aan 'n vrywaringsklousule onderhewig. Volledige besonderhede is by www.it.up.ac.za/documentation/governance/disclaimer/ beskikbaar.
===========================================

Attachment: smime.p7s
Description: S/MIME Cryptographic Signature