Re: [RFC] [PATCH] Dirty pages in the page cache

Stephen C. Tweedie (sct@dcs.ed.ac.uk)
Mon, 12 Jan 1998 22:37:07 GMT


Hi,

On 11 Jan 1998 21:11:09 -0600, ebiederm+eric@npwt.net (Eric
W. Biederman) said:

> What follows is a patch to allow dirty pages in the page cache.

> It uses the `writepage' function of an inode, to write out dirty pages
> from the page cache. I have a filesystem that currently uses this and
> shows performance comparable with ext2.

> The basic idea of use is a filesystem will set an extra dirty bit on a
> page, when a write occurs to it. And if the dirty bit is set it is
> guaranteed that before the page is removed from memory `writepage' will
> be called. For filesystems that need more precise tracking they
> should be able to do that on their own.

One big question with this: _should_ this sort of thing be done directly
in the page cache? Most filesystems will require extra information in
order to keep their current write semantics. Ext2fs will require
information about which blocks within the buffer are dirty, especially
for short files on 1k block filesystems. NFS will require credential
information.
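
To make the ext2 case concrete, here is a minimal user-space sketch of
per-block dirty tracking within one page. It assumes a 4k page and 1k
blocks; struct toy_page and both helpers are hypothetical names for
illustration, not the kernel's struct page or any ext2 code.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE   4096u  /* assumed page size */
#define TOY_BLOCK_SIZE  1024u  /* 1k-block filesystem */
#define TOY_BLOCKS_PER_PAGE (TOY_PAGE_SIZE / TOY_BLOCK_SIZE)

/* Hypothetical per-page state; not the kernel's struct page. */
struct toy_page {
    unsigned char data[TOY_PAGE_SIZE];
    unsigned int dirty_blocks;  /* bit i set => block i needs writing */
};

/* Mark every block touched by a write of len bytes at offset as dirty. */
static void toy_mark_dirty(struct toy_page *pg, size_t offset, size_t len)
{
    assert(len > 0 && offset < TOY_PAGE_SIZE && len <= TOY_PAGE_SIZE - offset);
    size_t first = offset / TOY_BLOCK_SIZE;
    size_t last = (offset + len - 1) / TOY_BLOCK_SIZE;
    for (size_t b = first; b <= last; b++)
        pg->dirty_blocks |= 1u << b;
}

/* Count the blocks a writeback pass would have to submit for this page. */
static unsigned int toy_dirty_count(const struct toy_page *pg)
{
    unsigned int n = 0;
    for (unsigned int b = 0; b < TOY_BLOCKS_PER_PAGE; b++)
        if (pg->dirty_blocks & (1u << b))
            n++;
    return n;
}
```

A single page-level dirty bit would force all four blocks out on every
writeback; the mask lets a short file write back only the block that
actually changed.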

An alternative way which would certainly make it simpler to maintain
ext2fs semantics with minimal effort would be to make the ext2 write
code just a little bit smarter than it is: instead of copying user data
into buffers and doing a separate vm update, it could overlay the
required buffers on top of the page cache, sharing the physical page,
and let the existing bdflush write out the pages eventually.
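
A rough user-space sketch of that overlay idea follows. Each buffer
descriptor points into the page's memory instead of owning a private
copy, so one copy from user space updates both views. All names
(ovl_page, ovl_buffer, ovl_write) are hypothetical, and the sketch
assumes the page's blocks are contiguous on disk, which real ext2 does
not guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OVL_PAGE_SIZE   4096u  /* assumed page size */
#define OVL_BLOCK_SIZE  1024u  /* assumed filesystem block size */
#define OVL_BLOCKS_PER_PAGE (OVL_PAGE_SIZE / OVL_BLOCK_SIZE)

/* Hypothetical stand-in for a buffer_head overlaid on a cached page. */
struct ovl_buffer {
    unsigned char *b_data;    /* points into the page, no private copy */
    unsigned long b_blocknr;  /* on-disk block number (illustrative) */
    int b_dirty;              /* set when this block needs writeback */
};

struct ovl_page {
    unsigned char data[OVL_PAGE_SIZE];
    struct ovl_buffer bufs[OVL_BLOCKS_PER_PAGE];
};

/* Point each buffer at its slice of the page's memory. */
static void ovl_attach_buffers(struct ovl_page *pg, unsigned long first_block)
{
    for (size_t i = 0; i < OVL_BLOCKS_PER_PAGE; i++) {
        pg->bufs[i].b_data = pg->data + i * OVL_BLOCK_SIZE;
        pg->bufs[i].b_blocknr = first_block + i;  /* contiguity is assumed */
        pg->bufs[i].b_dirty = 0;
    }
}

/* Copy user data once into the page and dirty the overlapping buffers. */
static void ovl_write(struct ovl_page *pg, size_t offset,
                      const void *src, size_t len)
{
    assert(len > 0 && offset < OVL_PAGE_SIZE && len <= OVL_PAGE_SIZE - offset);
    memcpy(pg->data + offset, src, len);
    size_t last = (offset + len - 1) / OVL_BLOCK_SIZE;
    for (size_t b = offset / OVL_BLOCK_SIZE; b <= last; b++)
        pg->bufs[b].b_dirty = 1;
}
```

There is no second copy and no separate vm update step here: the dirty
buffers are simply left for the normal buffer writeback to find.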

The struct buffer_head cache is a fairly natural place to keep
information about physically dirty block device blocks, but the data
itself could just as easily stay in the page cache. It would be a
fairly trivial extension to the buffer.c async logic to allow
asynchronous writeback buffers to be marked as free-after-IO (currently
we only do free-after-IO on anonymous buffer_heads). We already have
the necessary timing logic to do 30-second writeback of buffer_heads,
too.
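
As an illustration only, the aging and free-after-IO behaviour could
look something like the user-space sketch below. The wb_buffer struct,
its free_after_io flag and wb_flush_old() are hypothetical, and I/O
completion is simulated inline rather than being asynchronous as it
would be in buffer.c.

```c
#include <assert.h>
#include <stddef.h>

#define WB_AGE_SECONDS 30  /* writeback age from the text above */

/* Hypothetical buffer descriptor; not the kernel's struct buffer_head. */
struct wb_buffer {
    int dirty;          /* needs to be written to disk */
    int free_after_io;  /* release the descriptor once the write is done */
    long dirtied_at;    /* time the buffer was dirtied, in seconds */
    int in_use;         /* 1 while the descriptor is allocated */
};

/* Write out buffers dirty for at least WB_AGE_SECONDS; return the count. */
static int wb_flush_old(struct wb_buffer *bufs, size_t n, long now)
{
    int written = 0;
    for (size_t i = 0; i < n; i++) {
        struct wb_buffer *b = &bufs[i];
        if (!b->in_use || !b->dirty)
            continue;
        if (now - b->dirtied_at < WB_AGE_SECONDS)
            continue;  /* too young, leave for a later pass */
        /* Real code would queue async I/O; completion is simulated here. */
        b->dirty = 0;
        written++;
        if (b->free_after_io)
            b->in_use = 0;  /* descriptor released, page data stays cached */
    }
    return written;
}
```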

There's a big reason to track dirty state in the page cache, though, and
that is for speed of fsync(). Fortunately, the page->buffers pointer would give
us an easy way of finding (and syncing) all dirty buffers associated
with each page on an inode's page-cache ring.
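
In sketch form, that fsync() walk might look like the following. The
structs are simplified, hypothetical stand-ins (a circular ring of
pages per inode, a NULL-terminated buffer chain per page) and not the
real kernel layouts; "syncing" a buffer is reduced to clearing its
dirty flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct fs_buffer {
    int dirty;
    struct fs_buffer *next;     /* next buffer on the same page */
};

struct fs_page {
    struct fs_buffer *buffers;  /* plays the role of page->buffers */
    struct fs_page *next;       /* next page on the inode's ring */
};

struct fs_inode {
    struct fs_page *pages;      /* any page on the circular ring, or NULL */
};

/* Visit each page of the inode once and sync its dirty buffers. */
static int fs_sync_inode(struct fs_inode *inode)
{
    int synced = 0;
    struct fs_page *pg = inode->pages;

    if (pg == NULL)
        return 0;
    do {
        for (struct fs_buffer *bh = pg->buffers; bh != NULL; bh = bh->next) {
            if (bh->dirty) {
                bh->dirty = 0;  /* real code would write and wait here */
                synced++;
            }
        }
        pg = pg->next;
    } while (pg != inode->pages);
    return synced;
}
```

The cost is proportional to the pages cached for that one inode, not
to the size of the whole buffer cache.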

Cheers,
Stephen.