Re: Data loss on NFS mounted directory (2.2.13)

Jens Benecke (jens@pinguin.conetix.de)
Fri, 5 Nov 1999 18:16:09 +0100


On Fri, Nov 05, 1999 at 03:41:00PM +0100, Trond Myklebust wrote:

>>> > When the download was about half finished, I 'mv'ed the file
>>> > into another directory - same NFS filesystem, same partition on
>>> > server, just a subdirectory. No problem, I thought, this has
>>> > always worked on all machines I know.
>>> Could you run a 'tcpdump -s 256' before and after the 'mv', so
>>> that we can capture a write or two? I have a feeling this is a
>>> server bug (BTW
> > Yes. That's just about what I wanted to hear: how to trace the
> > issue. (Note that I'm not calling it a bug yet...)
> The dump you provided of activity after the 'mv' seemed to be missing
> 'write' calls. Is this an accident, or do the 'write' really stop going
> down the wire?

I copy&paste'd it from an xterm. So, the write calls you mention were not
displayed by tcpdump.

> If the latter is the case, it would be useful to see what goes on exactly
> at the moment around the 'mv' itself. I suspect that the server is
> stating that the file handle is stale: if we can capture the first
> 'write' (and the server reply) after the mv, that should tell us whether
> this is indeed happening or not.

I tried this again today. With nfs_debug = 9 (as you told me). And it
works perfectly. Same kernel, same machine, same setup. We have a
schroedingbug here, methinks: Once you track it, it dissapears.

> Note: you might also want to look at the debugging output from 'echo "9"
> > /proc/sys/sunrpc/nfs_debug' during the 'mv'.

This should be the interesting part:

nfs: write(jens/so51a_os2_49.exe(2047449815), 4096@491520)
NFS: nfs_updatepage(jens/so51a_os2_49.exe 4096@491520, sync=0)
NFS: find_write_request(3/2047449815, c02f5850)
NFS: create_write_request(jens/so51a_os2_49.exe, 491520+4096)
NFS: append_write_request(c14ac1ec, c2fcf360)
NFS: 3141 schedule_write_request (async)
NFS: 3140 nfs_wback_begin (jens/so51a_os2_49.exe, status=0 flags=0)
NFS: 3140 nfs_wback_result (jens/so51a_os2_49.exe, status=0, flags=100)
NFS: refresh_inode(3/2047449815 ct=1)
NFS: remove_write_request(c14ac1ec, c2fcfb40)
NFS: nfs_readdir(//jens)
last message repeated 3 times
NFS: refresh_inode(3/2048665601 ct=1)
NFS: refresh_inode(3/2048358438 ct=1)
NFS: refresh_inode(3/2047539207 ct=1)
NFS: dentry_delete(Private.16/StarOffice, 0)
NFS: refresh_inode(3/2047449815 ct=1)
NFS: lookup(StarOffice/so51a_os2_49.exe)
NFS: dentry_delete(StarOffice/so51a_os2_49.exe, 0)
NFS: refresh_inode(3/2047449815 ct=1)
NFS: rename(jens/so51a_os2_49.exe -> StarOffice/so51a_os2_49.exe, ct=1)
NFS: 3141 nfs_wback_begin (jens/so51a_os2_49.exe, status=0 flags=0)
NFS: 3141 nfs_wback_result (jens/so51a_os2_49.exe, status=0, flags=100)
NFS: refresh_inode(3/2047449815 ct=1)
NFS: remove_write_request(c14ac1ec, c2fcf360)
NFS: dentry_delete(jens/so51a_os2_49.exe, 0)
nfs: write(StarOffice/so51a_os2_49.exe(2047449815), 4096@495616)
NFS: nfs_updatepage(StarOffice/so51a_os2_49.exe 4096@495616, sync=0)
NFS: find_write_request(3/2047449815, c02f5378)
NFS: create_write_request(StarOffice/so51a_os2_49.exe, 495616+4096)
NFS: append_write_request(c14ac1ec, c2fcf360)
NFS: 3152 schedule_write_request (async)
nfs: write(StarOffice/so51a_os2_49.exe(2047449815), 4096@499712)
NFS: nfs_updatepage(StarOffice/so51a_os2_49.exe 4096@499712, sync=0)
NFS: find_write_request(3/2047449815, c0307f28)
NFS: create_write_request(StarOffice/so51a_os2_49.exe, 499712+4096)
NFS: append_write_request(c14ac1ec, c2fcfb40)
NFS: 3153 schedule_write_request (async)
NFS: 3152 nfs_wback_begin (StarOffice/so51a_os2_49.exe, status=0 flags=0)
NFS: 3152 nfs_wback_result (StarOffice/so51a_os2_49.exe, status=-116, flags=100
NFS: remove_write_request(c14ac1ec, c2fcf360)

The "trouble" is, _now_ it works.

Thanks for your help anyway. =;) For next time, I know what to do. :)

-- 
_ciao, Jens_______________________________ http://www.pinguin.conetix.de

Windows NT indeed has very low Total Cost of Ownership. Trouble is, Microsoft _owns_ Windows NT. You just licensed it.

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/