nfsv3 lockd causes kernel oops

From: Trond Myklebust (trond.myklebust@fys.uio.no)
Date: Tue Sep 19 2000 - 08:58:49 EST


Hi,

  Ever since the introduction of the new file locking code in 2.4.x,
NFS locking has been seriously broken. A preliminary look seems to
indicate that the problems might lie in the generic locking code: when
running the connectathon tests against another server I get crap like

      panic("Attempting to free lock with active wait queue")

This occurs regularly when running the various child process/parent
process locking tests (essentially those tests which activate the lock
blocking).

  This needs to be fixed before we release the final 2.4.0. I'll see
if I can't find time to look into it myself in the course of the week,
however I'm trying hard not to get too distracted from writing up my
thesis. It would therefore be nice if Willy and/or others familiar
with the file locking could take a look too.

  In the meantime, perhaps Ted could put this issue into his TODO
list?

Cheers,
  Trond

>>>>> " " == Christian Brandes <brandes@caxopen.de> writes:

> Hi!

> I've been trying to setup an NFSv3 server that supports
> nfs_lock.

> So I installed RedHat 7.0beta (6.9.5) and built a 2.4.0-test7
> kernel. During boot up I noticed that the launching of
> nfslockd did not succed, might be that I do not have the right
> nfs-tools. I use the ones that are provided with the distrib
> (nfs-utils-0.1.9.1-5).

> I did try out what happens when I lock a file by nfs. I started
> my little test program twice on a client machine:

> #include <stdio.h> include <errno.h> include <stdlib.h> include
> #<sys/file.h> include <sys/types.h> include <sys/stat.h>
> #include <fcntl.h> include <unistd.h>

> int main (void) {
> int fd; int rc; printf ("\n"); fd = open("./t1", O_RDWR); rc
> = flock(fd, LOCK_EX ); printf ("rc: %d\n", rc); sleep (30);
> printf ("exit\n");
> }

> And the result was a kernel Oops on the server:

> locks_insert_block: removing duplicated lock (pid=59714
> 0-9223372036854775807 type=1) Unable to handle kernel NULL
> pointer dereference at virtual address 00000004
> printing eip:
> c013b88e *pde = 00000000 Oops: 0002 CPU: 0 EIP:
> 0010:[locks_delete_block+14/48] EFLAGS: 00010286 eax: c586f86c
> ebx: 00000000 ecx: c586f878 edx: 00000000 esi: c586f878 edi:
> cfe1a9f0 ebp: cfe1a9f0 esp: c581def8 ds: 0018 es: 0018 ss: 0018
> Process lockd (pid: 2368, stackpage=c581d000) Stack: c586f86c
> c013b8e7 c586f86c c0224e60 0000e942 00000000 00000000 ffffffff
> 7fffffff 00000001 cfcbcc08 cfcbc81c c586f800 c013ce4d
> cfe1a9f0 c586f86c c016d7cd cfe1a9f0 c586f86c c586f800
> c0171c69 c5983400 c581df58 cfcbc7d0
> Call Trace: [locks_insert_block+55/112] [tvecs+18936/114884]
> [posix_block_lock+13/16] [nlmsvc_lock+685/720]
> [nlm4svc_retrieve_args+169/240] [nlm4svc_proc_lock+149/224]
> [svc_process+729/1248]
> [lockd+423/576] [lockd+0/576] [kernel_thread+43/64]
> Code: 89 5a 04 89 13 89 48 0c 89 48 10 8d 48 04 8b 59 04 8b 50
> 04

> I would be glad if you could tell me what went wrong and how I
> could solve the problem.

> Thanx a lot

> Christian Brandes

> ----------------------------------------------
> C H R I S T I A N B R A N D E S
> CAxOPEN product development technology GmbH
> Sauerwiesen 2, D-67661 Kaiserslautern
> E-Mail: brandes@caxopen.de
> Tel.: +49 6301 70919 0 Fax: +49 6301 70919 59
> ----------------------------------------------

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Sep 23 2000 - 21:00:20 EST