Re: 'NFS stale file handle' with 2.5

From: Jan Dittmer (j.dittmer@portrix.net)
Date: Tue Jul 15 2003 - 10:20:13 EST


Neil Brown wrote:
> On Saturday July 12, j.dittmer@portrix.net wrote:
>
>>Hi,
>>
>>I'm experiencing really big problems with nfs on 2.5 - and I'm a bit
>>stuck debugging.
>>
>
> This makes me a bit suspicious of hardware, probably networking. It
> really looks like data is getting corrupted between client and server.
>

OK, to rule out hardware I mounted the filesystem on the same machine that
exports it, so no real network hardware is involved, only lo.
I ran nhfsstone (which, by the way, is a great tool for reproducing this; just
type nhfsrun) and got this almost immediately:

17:16:32.706611 localhost.3772543495 > localhost.nfs: 148 remove fh Unknown/1
"012abcdefghijklmn" (DF) [ttl 0]
17:16:32.707305 localhost.3789320711 > localhost.nfs: 116 fsstat fh Unknown/1
(DF) [ttl 0]
17:16:32.707602 localhost.3806097927 > localhost.nfs: 136 access fh Unknown/1
000d (DF) [ttl 0]
17:16:32.709633 localhost.3822875143 > localhost.nfs: 116 fsstat fh Unknown/1
(DF) [ttl 0]
17:16:32.709822 localhost.3839652359 > localhost.nfs: 144 commit fh Unknown/1
2048 bytes @ 0x000000000 (DF) [ttl 0]
17:16:32.725623 localhost.3806097927 > localhost.nfs: 136 access fh Unknown/1
000d (DF) [ttl 0]
17:16:32.733637 localhost.nfs > localhost.3772543495: reply ok 120 remove (DF)
[ttl 0]
17:16:32.735074 localhost.nfs > localhost.3789320711: reply ok 84 fsstat
tbytes 0x189b948000 fbytes 0x173616c000 abytes 0x15f616c000 (DF) [ttl
  0]
17:16:32.743608 localhost.3806097927 > localhost.nfs: 136 access fh Unknown/1
000d (DF) [ttl 0]
17:16:32.750618 localhost.3839652359 > localhost.nfs: 144 commit fh Unknown/1
2048 bytes @ 0x000000000 (DF) [ttl 0]
17:16:32.756246 localhost.nfs > localhost.3302781447: reply ok 144 setattr
(DF) [ttl 0]
17:16:32.756467 localhost.nfs > localhost.3806097927: reply ok 120 access c
000d (DF) [ttl 0]
17:16:32.759073 localhost.3856429575 > localhost.nfs: 168 setattr fh Unknown/1
(DF) [ttl 0]
17:16:32.759632 localhost.nfs > localhost.3822875143: reply ok 84 fsstat
tbytes 0x189b948000 fbytes 0x1736172000 abytes 0x15f6172000 (DF) [ttl
  0]
17:16:32.759815 localhost.3873206791 > localhost.nfs: 148 lookup fh Unknown/1
"012abcdefghijklmn" (DF) [ttl 0]
17:16:32.760182 localhost.nfs > localhost.3839652359: reply ok 128 commit (DF)
[ttl 0]
17:16:32.760574 localhost.3889984007 > localhost.nfs: 148 lookup fh Unknown/1
"032abcdefghijklmn" (DF) [ttl 0]
17:16:32.760703 localhost.nfs > localhost.3806097927: reply ok 120 access c
000d (DF) [ttl 0]
17:16:32.760942 localhost.3906761223 > localhost.nfs: 148 lookup fh Unknown/1
"007abcdefghijklmn" (DF) [ttl 0]
17:16:32.761071 localhost.nfs > localhost.3806097927: reply ok 120 access c
000d (DF) [ttl 0]
17:16:32.769785 localhost.3873206791 > localhost.nfs: 148 lookup fh Unknown/1
"012abcdefghijklmn" (DF) [ttl 0]
17:16:32.778592 localhost.3889984007 > localhost.nfs: 148 lookup fh Unknown/1
"032abcdefghijklmn" (DF) [ttl 0]
17:16:32.797611 localhost.3906761223 > localhost.nfs: 148 lookup fh Unknown/1
"007abcdefghijklmn" (DF) [ttl 0]
17:16:32.830371 localhost.nfs > localhost.3839652359: reply ok 128 commit (DF)
[ttl 0]
17:16:32.831099 localhost.nfs > localhost.3856429575: reply ok 36 setattr
ERROR: Stale NFS file handle (DF) [ttl 0]
17:16:32.831728 localhost.nfs > localhost.3873206791: reply ok 32 lookup
ERROR: Stale NFS file handle (DF) [ttl 0]
17:16:32.832699 localhost.nfs > localhost.3889984007: reply ok 32 lookup
ERROR: Stale NFS file handle (DF) [ttl 0]
17:16:32.832862 localhost.3923538439 > localhost.nfs: 132 getattr fh Unknown/1
(DF) [ttl 0]
17:16:32.833612 localhost.nfs > localhost.3906761223: reply ok 32 lookup
ERROR: Stale NFS file handle (DF) [ttl 0]
17:16:32.834058 localhost.nfs > localhost.3873206791: reply ok 32 lookup
ERROR: Stale NFS file handle (DF) [ttl 0]
17:16:32.834416 localhost.3940315655 > localhost.nfs: 164 setattr fh Unknown/1
(DF) [ttl 0]
17:16:32.834940 localhost.nfs > localhost.3889984007: reply ok 32 lookup
ERROR: Stale NFS file handle (DF) [ttl 0]
17:16:32.835896 localhost.nfs > localhost.3906761223: reply ok 236 lookup fh
Unknown/1 (DF) [ttl 0]
17:16:32.836910 localhost.nfs > localhost.3923538439: reply ok 112 getattr REG
100666 ids 1000/1000 sz 0x000000000 (DF) [ttl 0]
17:16:32.837481 localhost.nfs > localhost.3940315655: reply ok 144 setattr
(DF) [ttl 0]
17:16:32.841279 localhost.3957092871 > localhost.nfs: 132 getattr fh Unknown/1
(DF) [ttl 0]
17:16:32.843429 localhost.nfs > localhost.3957092871: reply ok 112 getattr REG
100666 ids 1000/1000 sz 0x000000000 (DF) [ttl 0]
17:16:32.844561 localhost.3973870087 > localhost.nfs: 132 getattr fh Unknown/1
(DF) [ttl 0]
17:16:32.844918 localhost.3990647303 > localhost.nfs: 148 lookup fh Unknown/1
"012abcdefghijklmn" (DF) [ttl 0]

I'll try to reproduce this on my other (UP) machines.
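
For reading a trace like the one above: for NFS requests tcpdump prints the
RPC transaction ID (XID) where the source port would normally appear, so each
reply can be paired with its request by that number. Below is a minimal Python
sketch that lists the replies carrying "Stale NFS file handle" together with
how many times the matching request had been sent. It assumes the wrapped
record layout shown above; the names join_records, stale_replies and SAMPLE
are illustrative only and not part of any tool.

```python
import re
from collections import defaultdict

# A record starts with a tcpdump timestamp; other lines are wrapped continuations.
RECORD_START = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d{6} ")
# Request: client (XID shown in the port position) -> server.
REQUEST = re.compile(r"^(\S+) localhost\.(\d+) > localhost\.nfs: \d+ (\w+)")
# Reply: server -> client, same XID.
REPLY = re.compile(r"^(\S+) localhost\.nfs > localhost\.(\d+): reply ok \d+ (\w+)")

# Short excerpt copied from the trace above (assumed representative).
SAMPLE = """\
17:16:32.759073 localhost.3856429575 > localhost.nfs: 168 setattr fh Unknown/1
(DF) [ttl 0]
17:16:32.759815 localhost.3873206791 > localhost.nfs: 148 lookup fh Unknown/1
"012abcdefghijklmn" (DF) [ttl 0]
17:16:32.769785 localhost.3873206791 > localhost.nfs: 148 lookup fh Unknown/1
"012abcdefghijklmn" (DF) [ttl 0]
17:16:32.831099 localhost.nfs > localhost.3856429575: reply ok 36 setattr
ERROR: Stale NFS file handle (DF) [ttl 0]
17:16:32.831728 localhost.nfs > localhost.3873206791: reply ok 32 lookup
ERROR: Stale NFS file handle (DF) [ttl 0]
17:16:32.835896 localhost.nfs > localhost.3906761223: reply ok 236 lookup fh
Unknown/1 (DF) [ttl 0]
"""


def join_records(lines):
    """Merge wrapped continuation lines back into one string per packet."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def stale_replies(lines):
    """Return (xid, operation, requests_seen_so_far) for each stale-handle reply."""
    requests = defaultdict(int)
    stale = []
    for record in join_records(lines):
        match = REQUEST.match(record)
        if match:
            requests[match.group(2)] += 1
            continue
        match = REPLY.match(record)
        if match and "Stale NFS file handle" in record:
            xid, op = match.group(2), match.group(3)
            stale.append((xid, op, requests[xid]))
    return stale


if __name__ == "__main__":
    for xid, op, sent in stale_replies(SAMPLE.splitlines()):
        print(f"xid {xid}: {op} failed, request seen {sent} time(s)")
```

A request count above one for a given XID means the client sent that call
more than once before the reply arrived, which the full trace shows for the
three lookup calls.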

Thanks,

Jan

-- 
Linux rubicon 2.6.0-test1-jd2 #1 SMP Mon Jul 14 17:37:41 CEST 2003 i686
