Re: [NFS] blocks of zeros (NULLs) in NFS files in kernels >= 2.6.20

From: Aaron Straus
Date: Fri Sep 05 2008 - 18:14:51 EST


Hi,

On Sep 05 04:36 PM, Chuck Lever wrote:
> I have the latest Fedora 9 kernels on two clients, mounting via NFSv3
> using "actimeo=600" (for other reasons). The server is OpenSolaris
> 2008.5.
>
> reader.py reported zeroes in the test file after about 5 minutes.

Awesome. Thanks for testing! Our actimeo is much shorter, which is
probably why it happens sooner for us.

> Looking at the file a little later, I don't see any problems with it.
>
> Since your scripts are not using any kind of serialization (ie file
> locking) between the clients, I wonder if non-determinant behavior is
> to be expected.

Hmm... yep. I don't know what guarantees we want to make. The
behavior doesn't seem to be consistent with older kernels though... so
I'm thinking it might be a bug.

We hit this particular issue because we have scripts which essentially
'tail -f' log files looking for errors. They miss log messages (and
see corrupted ones) because of the NULLs. That's also why there is no
serialization: we don't need it when grep'ing through log messages.
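
For anyone who wants to watch for this from the reading side, here is a
rough sketch of a 'tail -f'-style follower that flags runs of NUL bytes.
It is not the reader.py from this thread; the names find_nul_runs and
follow, and the poll_interval/max_polls parameters, are made up for
illustration.

```python
import time


def find_nul_runs(data, min_len=1):
    """Return a list of (offset, length) for each run of NUL bytes in data."""
    runs = []
    i = 0
    n = len(data)
    while i < n:
        if data[i] == 0:
            j = i
            # Extend the run to the first non-NUL byte (or end of data).
            while j < n and data[j] == 0:
                j += 1
            if j - i >= min_len:
                runs.append((i, j - i))
            i = j
        else:
            i += 1
    return runs


def follow(path, poll_interval=1.0, max_polls=None):
    """Yield (file_offset, chunk) as bytes are appended to path.

    Hypothetical helper: polls the file like 'tail -f'. max_polls bounds
    the number of read attempts; None means poll forever.
    """
    polls = 0
    offset = 0
    # Binary mode so NUL bytes reach us unchanged.
    with open(path, "rb") as f:
        while max_polls is None or polls < max_polls:
            chunk = f.read()
            if chunk:
                yield offset, chunk
                offset += len(chunk)
            else:
                time.sleep(poll_interval)
            polls += 1


if __name__ == "__main__":
    import sys

    for base, chunk in follow(sys.argv[1]):
        for off, length in find_nul_runs(chunk):
            print("NULs at file offset %d, length %d" % (base + off, length))
```

Run it as "python script.py /path/to/logfile" on the reading client. It
only reports what the reader sees at that moment; as noted above, the
file can look fine when checked again later.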

I'm bisecting now. I see a block of intricate-looking NFS patches, so
I'll try to narrow it down to a particular commit.

I'll also get the wireshark data at that point.

Thanks,
=a=


--
===================
Aaron Straus
aaron@xxxxxxxxxxxxx
