Re: nfs: infinite loop in fcntl(F_SETLKW)

From: Miklos Szeredi
Date: Mon Apr 14 2008 - 17:16:32 EST


> On Sun, Apr 13, 2008 at 10:28:08AM +0200, Miklos Szeredi wrote:
> > > Apologies, that was indeed a behavioral change introduced in a commit
> > > that claimed just to be shuffling code around.
> >
> > Another complaint about this series: using EINPROGRESS to signal
> > asynchronous locking looks really fishy. How does the filesystem
> > know, that the caller wants to do async locking?
>
> The caller sets a fl_grant callback. But I guess the interesting
> question is how the caller knows that the filesystem is really going to
> return the results asynchronously:
>
> > How do we make sure,
> > that the filesystem (like fuse or 9p, which "blindly" return the error
> > from the server) doesn't return EINPROGRESS even when it's _not_ doing
> > an asynchronous lock?
>
> Right, there's no safeguard there--if fuse returns EINPROGRESS, then
> we'll wait for a grant callback that's not going to come. It should
> time out, so that's not a total disaster, but still.
>
> Anyway, I'm not sure what to do about that.
>
> >
> > I think it would have been much cleaner to have a completely separate
> > interface for async locking, instead of trying to cram that into
> > f_op->lock().
>
> Maybe so. ->lock() had quite a bit crammed into it even before this.

Oh yeah, but at least that was 1:1 with the fcntl interface.

> > Would that be possible to fix now? Or at least make EINPROGRESS a
> > kernel-internal error value (>512), to make it that it has a special
> > meaning for the _kernel only_?
>
> Perhaps so.
>
> The behavior of lockd will still depend to some degree on the exact
> error returned from the filesystem--e.g. if you return -EAGAIN from
> ->lock() without later calling ->fl_grant() it will cause some
> unexpected delays. (Though again clients will eventually give up and
> poll for the lock.)

OK, so the semantics of vfs_lock_file() are now:

1) if !FL_SLEEP, then return 0 if granted, -EAGAIN on contention
2) if FL_SLEEP and fl_grant == NULL, then return 0 if granted, block on
contention
3) if FL_SLEEP and fl_grant != NULL, then return 0 if granted, on
contention:
a) either return -EINPROGRESS and call fl_grant when granted
b) or return -EAGAIN and call fl_notify when the lock needs retrying

As a first step, how about doing the following:

For 3) use LOCK_FILE_ASYNC as a return value. AFAICS there's no
reason to distinguish between a) and b). For 1) leave the -EAGAIN
error, since in that case no lock was queued.

Here's an untested patch. Comments?

Miklos


---
fs/gfs2/locking/dlm/plock.c | 2 +-
fs/lockd/svclock.c | 13 ++++---------
fs/locks.c | 20 ++++++++++----------
include/linux/fs.h | 15 +++++++++++++++
4 files changed, 30 insertions(+), 20 deletions(-)

Index: linux-2.6/fs/locks.c
===================================================================
--- linux-2.6.orig/fs/locks.c 2008-04-14 22:19:57.000000000 +0200
+++ linux-2.6/fs/locks.c 2008-04-14 22:52:06.000000000 +0200
@@ -784,8 +784,10 @@ find_conflict:
if (!flock_locks_conflict(request, fl))
continue;
error = -EAGAIN;
- if (request->fl_flags & FL_SLEEP)
- locks_insert_block(fl, request);
+ if (!(request->fl_flags & FL_SLEEP))
+ goto out;
+ error = FILE_LOCK_ASYNC;
+ locks_insert_block(fl, request);
goto out;
}
if (request->fl_flags & FL_ACCESS)
@@ -841,7 +843,7 @@ static int __posix_lock_file(struct inod
error = -EDEADLK;
if (posix_locks_deadlock(request, fl))
goto out;
- error = -EAGAIN;
+ error = FILE_LOCK_ASYNC;
locks_insert_block(fl, request);
goto out;
}
@@ -1040,7 +1042,7 @@ int posix_lock_file_wait(struct file *fi
might_sleep ();
for (;;) {
error = posix_lock_file(filp, fl, NULL);
- if ((error != -EAGAIN) || !(fl->fl_flags & FL_SLEEP))
+ if (error != FILE_LOCK_ASYNC)
break;
error = wait_event_interruptible(fl->fl_wait, !fl->fl_next);
if (!error)
@@ -1112,9 +1114,7 @@ int locks_mandatory_area(int read_write,

for (;;) {
error = __posix_lock_file(inode, &fl, NULL);
- if (error != -EAGAIN)
- break;
- if (!(fl.fl_flags & FL_SLEEP))
+ if (error != FILE_LOCK_ASYNC)
break;
error = wait_event_interruptible(fl.fl_wait, !fl.fl_next);
if (!error) {
@@ -1534,7 +1534,7 @@ int flock_lock_file_wait(struct file *fi
might_sleep();
for (;;) {
error = flock_lock_file(filp, fl);
- if ((error != -EAGAIN) || !(fl->fl_flags & FL_SLEEP))
+ if (error != FILE_LOCK_ASYNC)
break;
error = wait_event_interruptible(fl->fl_wait, !fl->fl_next);
if (!error)
@@ -1806,7 +1806,7 @@ again:
else {
for (;;) {
error = posix_lock_file(filp, file_lock, NULL);
- if (error != -EAGAIN || cmd == F_SETLK)
+ if (error != FILE_LOCK_ASYNC)
break;
error = wait_event_interruptible(file_lock->fl_wait,
!file_lock->fl_next);
@@ -1934,7 +1934,7 @@ again:
else {
for (;;) {
error = posix_lock_file(filp, file_lock, NULL);
- if (error != -EAGAIN || cmd == F_SETLK64)
+ if (error != FILE_LOCK_ASYNC)
break;
error = wait_event_interruptible(file_lock->fl_wait,
!file_lock->fl_next);
Index: linux-2.6/fs/gfs2/locking/dlm/plock.c
===================================================================
--- linux-2.6.orig/fs/gfs2/locking/dlm/plock.c 2008-04-09 16:44:15.000000000 +0200
+++ linux-2.6/fs/gfs2/locking/dlm/plock.c 2008-04-14 22:54:05.000000000 +0200
@@ -108,7 +108,7 @@ int gdlm_plock(void *lockspace, struct l
if (xop->callback == NULL)
wait_event(recv_wq, (op->done != 0));
else
- return -EINPROGRESS;
+ return FILE_LOCK_ASYNC;

spin_lock(&ops_lock);
if (!list_empty(&op->list)) {
Index: linux-2.6/fs/lockd/svclock.c
===================================================================
--- linux-2.6.orig/fs/lockd/svclock.c 2008-04-09 16:44:18.000000000 +0200
+++ linux-2.6/fs/lockd/svclock.c 2008-04-14 22:55:20.000000000 +0200
@@ -423,8 +423,8 @@ nlmsvc_lock(struct svc_rqst *rqstp, stru
goto out;
case -EAGAIN:
ret = nlm_lck_denied;
- break;
- case -EINPROGRESS:
+ goto out;
+ case FILE_LOCK_ASYNC:
if (wait)
break;
/* Filesystem lock operation is in progress
@@ -439,10 +439,6 @@ nlmsvc_lock(struct svc_rqst *rqstp, stru
goto out;
}

- ret = nlm_lck_denied;
- if (!wait)
- goto out;
-
ret = nlm_lck_blocked;

/* Append to list of blocked */
@@ -520,7 +516,7 @@ nlmsvc_testlock(struct svc_rqst *rqstp,
}

error = vfs_test_lock(file->f_file, &lock->fl);
- if (error == -EINPROGRESS) {
+ if (error == FILE_LOCK_ASYNC) {
ret = nlmsvc_defer_lock_rqst(rqstp, block);
goto out;
}
@@ -744,8 +740,7 @@ nlmsvc_grant_blocked(struct nlm_block *b
switch (error) {
case 0:
break;
- case -EAGAIN:
- case -EINPROGRESS:
+ case FILE_LOCK_ASYNC:
dprintk("lockd: lock still blocked error %d\n", error);
nlmsvc_insert_block(block, NLM_NEVER);
nlmsvc_release_block(block);
Index: linux-2.6/include/linux/fs.h
===================================================================
--- linux-2.6.orig/include/linux/fs.h 2008-04-09 16:44:54.000000000 +0200
+++ linux-2.6/include/linux/fs.h 2008-04-14 23:03:03.000000000 +0200
@@ -837,6 +837,21 @@ extern spinlock_t files_lock;
#define FL_SLEEP 128 /* A blocking lock */

/*
+ * FILE_LOCK_ASYNC:
+ *
+ * Special return value from posix_lock_file() and vfs_lock_file()
+ * for asynchronous locking.
+ *
+ * For posix_lock_file() it means, that the lock has been queued on
+ * the block list.
+ *
+ * For vfs_lock_file() it means, that the filesystem will call back
+ * either fl_notify() or fl_granted() when the lock needs to be
+ * retried or if it has been granted.
+ */
+#define FILE_LOCK_ASYNC 1
+
+/*
* The POSIX file lock owner is determined by
* the "struct files_struct" in the thread group
* (or NULL for no owner - BSD locks).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/