Re: Buffer and page cache

Linus Torvalds (torvalds@transmeta.com)
2 Nov 1999 19:17:17 GMT


In article <381F0018.988A9F07@cs.cmu.edu>, <braam@cs.cmu.edu> wrote:
>
>I'm working on a file system which talks to an "inode disk", the storage
>industry calls these object based disks. A simulated object based disk
>can be constructed from the lower half of ext2 (or any other file system
>for that matter).
>
>The file system has no knowledge of disk blocks, and solely uses the
>page cache.

Look at NFS. It sounds like your filesystem should use the page cache
exactly the way NFS uses it.

>I'd like these pages to age a little before handing them over to the
>"inode disk", because the "write_one_page" function called by
>generic_file_write would incur significant latency if the inode disk is
>"real", ie. not simulated in the same system.

No no no. That's NOT how you should see write_one_page() at all.

write_one_page() really only tells you that the user wrote to the page.
It's up to you to write it back some time later, using whatever
mechanism you want (and no, that mechanism does NOT have to be a buffer
head, in fact the whole setup was explictly designed to work with
arbitrary filesystems, with NFS being the "example" fs that was used to
validate the earliest versions).

If you want to delay the write, that's fine, and you should do so: you
should just set up your own data-structures to remember which parts of
the page have been written, and do the right coalescing etc, and then
have some timeout mechanism to write them _eventually_. You'd also
couple it with some way to just force out the write when there are lots
of other writes coming in - you don't want to end up with one large
burst.

And, surprise, surprise, you can just steal the code in NFS to do
exactly this. It's all in

linux/fs/nfs/write.c

and it's actually fairly simple. Each write request makes sure the page
stays in memory by incrementing the page count while a delayed write
exists.

It gets more complicated if you want to coalesce writes over multiple
pages etc, but even that is by no means rocket science. It's just a lot
more details to keep track about.

Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/