Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

From: Michael Eisler (mre@Zambeel.com)
Date: Sat Sep 16 2000 - 11:37:50 EST


> >>>>> " " == Michael Eisler <mre@Zambeel.com> writes:
>
> > Focus on correctness and do the expedient thing first, which
> > is:
> > - The first time a file is locked, flush dirty pages
> > to the server, and then invalidate the page cache
>
> This would be implemented with the last patch I proposed.
>
> > - While the file is locked, bypass the page cache for all
> > I/O.
>
> This is not possible given the current design of the Linux VFS. The
> design is such that all reads/writes go through the page cache. I'm

I'm not Linux-kernel literate. However, I found your
assertion surprising. Does procfs do page i/o as well?

file.c in fs/nfs suggests that the Linux VFS has non-page interfaces
in addition to page interfaces. fs/read_write.c suggests that the
read and write system calls use the non-page interface.

I cannot speak for Linux, but System V Release 4 derived systems
use the page cache primarily as a tool for each file system, yet
still hide the page interface from the code path leading from the
read/write system calls to the VFS.

> not sure that it is possible to get round this without some major
> changes in VFS philosophy. Hacks such as invalidating the cache after
> each read/write would definitely give rise to races.
>
> As far as I can see, the current use of the page cache should be safe
> as long as applications respect the locking boundaries, and don't
> expect consistency outside locked areas.

Then the code ought to enforce page aligned locks. Of course, while
that will produce correctness, it will violate the principle of
least surprise. It might be better to simply return an error when
a lock operation is attempted on an NFS file. That assumes, of course,
that the Linux kernel isn't capable of honoring a read() or write()
system call whenever the file system doesn't support page-based i/o,
which, again, I'd be surprised by.
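
To make the page-aligned lock idea concrete, here is a rough sketch in
plain userspace C. It is not Linux kernel code: struct lock_range and
both function names are made up for illustration, and the page size is
passed in as a parameter rather than taken from any kernel constant.
The first function is the check a client could use to reject a lock
that is not page aligned; the second widens a lock to the enclosing
pages.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical byte range [start, start + len); len is assumed > 0. */
struct lock_range {
    uint64_t start;
    uint64_t len;
};

/* Returns nonzero if the range starts and ends on page boundaries. */
static int range_is_page_aligned(struct lock_range r, uint64_t page_size)
{
    /* page_size must be a nonzero power of two. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (r.start % page_size == 0) && (r.len % page_size == 0);
}

/* Widens the range outward to the enclosing page boundaries. */
static struct lock_range range_round_to_pages(struct lock_range r,
                                              uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    uint64_t mask = page_size - 1;
    uint64_t first = r.start & ~mask;                 /* round start down */
    uint64_t end = (r.start + r.len + mask) & ~mask;  /* round end up */
    struct lock_range out = { first, end - first };
    return out;
}
```

With a 4096-byte page, a request to lock 50 bytes at offset 100 would
be widened to the whole first page, bytes 0 through 4095. An
application that asked for 50 bytes and ends up contending with other
processes for 4096 is the kind of surprise I mean.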

        -mre