NFS probs w/ NetApp

From: Andrew Morton (morton@nortelnetworks.com)
Date: Wed Mar 15 2000 - 06:09:48 EST


We're having some problems talking NFS to a NetApp server (kernel 2.2.5 and 2.2.13). I'd appreciate it if an NFS guru could help out with some trace interpretation.

I'm also wondering if there's a timeout handling buglet in the kernel.

Mount options (autofs):

/automounts/netapp /etc/auto.netapp -soft,rw,bg,intr,rsize=8192,wsize=8192,retry=5,retrans=130,timeo=14,noquota,nosuid

tcpdump output for a large read (zwoln001 is the netapp):

18:14:38.420863 zwoln001 > pwold011: (frag 63753:1244@2960)
18:14:38.421020 pwold011.175244598 > zwoln001.nfs: 152 read [|nfs]
18:14:38.422770 zwoln001.nfs > pwold011.175244598: reply ok 1472 read [|nfs] (frag 64009:1480@0+)
18:14:38.424002 zwoln001 > pwold011: (frag 64009:1480@1480+)
18:14:38.425041 zwoln001 > pwold011: (frag 64009:1244@2960)
18:14:38.436889 pwold011.192021814 > zwoln001.nfs: 152 read [|nfs]
18:14:38.438659 zwoln001.nfs > pwold011.192021814: reply ok 1472 read [|nfs] (frag 64265:1480@0+)
18:14:39.830059 pwold011.192021814 > zwoln001.nfs: 152 read [|nfs]
>>>> [ timeout ]
18:14:42.630092 pwold011.192021814 > zwoln001.nfs: 152 read [|nfs]
18:14:42.631824 zwoln001.nfs > pwold011.192021814: reply ok 1472 read [|nfs] (frag 64521:1480@0+)
18:14:42.633054 zwoln001 > pwold011: (frag 64521:1480@1480+)
18:14:42.634094 zwoln001 > pwold011: (frag 64521:1244@2960)

...

18:14:42.793453 zwoln001.nfs > pwold011.728892726: reply ok 1472 read [|nfs] (frag 7178:1480@0+)
18:14:42.794684 zwoln001 > pwold011: (frag 7178:1480@1480+)
18:14:42.795723 zwoln001 > pwold011: (frag 7178:1244@2960)
18:14:42.795877 pwold011.745669942 > zwoln001.nfs: 152 read [|nfs]
18:14:42.797627 zwoln001.nfs > pwold011.745669942: reply ok 1472 read [|nfs] (frag 7434:1480@0+)
18:14:42.798858 zwoln001 > pwold011: (frag 7434:1480@1480+)
18:14:42.799898 zwoln001 > pwold011: (frag 7434:1244@2960)
18:14:42.801114 pwold011.762447158 > zwoln001.nfs: 152 read [|nfs]
18:14:42.803039 zwoln001 > pwold011: (frag 7690:1480@1480+)
18:14:42.804079 zwoln001 > pwold011: (frag 7690:1244@2960)
>>>> [another timeout]
18:14:44.200164 pwold011.762447158 > zwoln001.nfs: 152 read [|nfs]
18:14:44.201945 zwoln001.nfs > pwold011.762447158: reply ok 1472 read [|nfs] (frag 7946:1480@0+)
18:14:44.203175 zwoln001 > pwold011: (frag 7946:1480@1480+)
18:14:44.204214 zwoln001 > pwold011: (frag 7946:1244@2960)
18:14:44.205896 pwold011.779224374 > zwoln001.nfs: 152 read [|nfs]

I assume the first timeout is simply due to the read request getting lost - could be n/w or server probs. Correct?

But the second retry at 18:14:44.200164 looks odd because the read seems to have been satisfied.

A couple of questions:

- How does one enable the 'chatty' mode on the client so we get some syslog output?

- Why are these retries only coming out after 2.8 seconds, when the 'timeo' option was set to 1.4 seconds?

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Mar 23 2000 - 21:00:17 EST