Re: 2.2.14 knfsd oops in nfssvc_encode_attrstat

From: Neil Brown (neilb@cse.unsw.edu.au)
Date: Thu Mar 02 2000 - 19:18:22 EST


On Thursday March 2, berend@growthnetworks.com wrote:
> We have a Linux server running 2.2.14 version of the kernel that oopses
> within a minute of starting NFS (version 2) exports (with the latest NFS
> patches applied, although the problem happens just the same with and
> without the patches, as well as with the Redhat 6.1 shipped 2.2.12
> kernel).

This is all very odd (but things always are until you find the
explanation, and then they are crystal clear).

If the resp filehandle doesn't have an inode, then something must have
failed earlier, and so an error should have been returned from the nfs
procedure handler, and so nfsd_dispatch should not have called the
encode function at all.

Could you gets some more data by:

   echo 255 > /proc/sys/sunrpc/nfsd_debug

and then getting it to fail again please.

NeilBrown

>
> Here's the oops dump:
>
> Unable to handle kernel NULL pointer dereference at virtual address
> 00000008
> current->tss.cr3 = 00101000, %cr3 = 00101000
> *pde = 00000000
> Oops: 0000
> CPU: 0
> EIP: 0010:[nfssvc_encode_attrstat+26/376]
> EFLAGS: 00010282
> eax: 00000000 ebx: ce10da00 ecx: 00000000 edx: ce17c01c
> esi: ce10da00 edi: ce17c014 ebp: ce10da00 esp: ce17bf68
> ds: 0018 es: 0018 ss: 0018
> Process nfsd (pid: 807, process nr: 30, stackpage=ce17b000)
> Stack: ce17c014 c0146307 ce10da00 ce17c01c ce3b57e0 ce10daf4 c01f8a60
> ce3b5844
> c017efe4 ce10da00 ce17c014 ce17a000 00000000 ce10da00 cfcffd40
> ce3b57e0
> c01f8e00 ce10da00 00000002 000186a3 00000002 ce17c014 c01f88ec
> 00000000
> Call Trace: [nfsd_dispatch+283/360] [svc_process+692/1316]
> [nfsd+213/444] [kernel_thread+35/48]
> Code: 8b 58 08 85 db 75 07 31 d2 e9 fd 00 00 00 66 8b 43 22 66 c1
>
> The problems seems to be that nfssvc_encode_attrstat gets called with a
> NULL resp->fh.fh_dentry, causing the d_inode dereference the cause a
> crash.
>
> For now I have temporarily worked around the problem by putting in a
> check for a NULL resp->fh.fh_dentry, but surely the problem is upstream
> from the nfssvc_encode_attrstat problem. Any ideas?
>
> The environment in which this happens has a lot of Solaris 2.6/2.7
> clients accessing the Linux NFS server. There are a couple of
> applications that make (light) use of file locks, if that's relevant.
>
> Thanks,
>
> Berend
>
>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Mar 07 2000 - 21:00:13 EST