Re: Q: generic_file_write sets PG_locked???

Eric W. Biederman (ebiederm+eric@ccr.net)
24 Apr 1999 09:40:33 -0500


>>>>> "TM" == Trond Myklebust <trond.myklebust@fys.uio.no> writes:

TM> Linus Torvalds <torvalds@transmeta.com> writes:
>> On 23 Apr 1999, Eric W. Biederman wrote:
>> >
>> > LT> In fact, it already does for NFS, which uses PageLocked for exactly that
>> > LT> reason (to serialize read and write requests to the same page).
>> >
>> > As I understand the NFS code it also tries using PageLocked to
>> > synchronize writes _to_ a page from user space.
>>
>> Yes. It also uses it to synchronize various other write events, like the
>> per-page write queue etc.
>>
>> Once that synchronization has been done, the thing is released: even
>> though there is a pending write. It's only a "data coherency lock", it's
>> not necessarily a "IO is pending on this" (you could use it for IO pending
>> too, but that would serialize everything: releasing the lock means that
>> you can gather up multiple write requests up until the time you decide to
>> actually physically push it out the door).

TM> For the write gathering and NFSv3 code, the page lock is also held
TM> during the actual WRITE RPC call. The PG_locked in that case means
TM> that either page is being copied, (and the writeback structures are
TM> being set up) or that IO is being done.

TM> Otherwise it obeys the same semantics as the standard kernel (the lock
TM> is still released when the write is pending but not undergoing IO).

The issue I see with using PageLocked for synchronizing writes to a page
is that it doesn't work with mmaped pages. Consider two different users
on the same machine mmap the same page read/write, read the page so it faults
in. And then write to the mapped page. Later when msync is called or
the swapper kicks in we notice the page is dirty, and write it out.

Basically I contend that using PageLocked as a "data coherency lock"
is broken, at least when using the generic code. Unix doesn't
garantee data coherency, on writes. That's why we have file locks
and append mode. And since the consistency is up to the application,
or up to a specific filesystem, why do we have some of the
implementation in generic code (particularly generic_file_write)?

Eric

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/