Suggestion for O_DIRECT and friends

Arjan van de Ven (arjan@stack.nl)
Sun, 20 Dec 1998 16:13:24 +0100 (CET)


This e-mail tries to give my view of the results of the recent
discussion about raw device-IO and mmap-readahead. It consists of three
"suggention-flags" that can be used on fopen. The mm-people already opinions
about this stuff, consider this document as a suggestion of how things
_could_ become in 2.3.xxx

File-level
----------

O_Direct
Hints the kernel that caching and read-ahead for requests to this
file are not a good idea. This usually means that the application
knows access-patternsbetter and performs it's own caching; any
memory that would be spend on kernel-caching is better used for this
application.

Possible usage:
* Databases (Oracle and others)
* Scientific (Huge matrices)
* All files that you need only once (configuration files)

Kernels view:
If blocks are in the memory already, give them to the app.
If they aren't read them in, give them to the app but don't
cache them. On write, flush and release all blocks.

O_Serialized
Hints the kernel that the file is read in a serial/linear way,
and that caching old results isn't usefull but that Read-ahead,
Read-behind and Write-clustering are.

Possible usage:
* Multimedia applications (video-streaming, MP3-playing)
* tar, gzip and others
* grep
* Reading configuration-files

Kernels view:
Read:
Read-ahead is a good idea, so it can read-ahead more than usual.
Pages that have been sent to the appl. can be freed immediatly.
Write:
Cluster writes, once written to disk, free the pages immediatly.

O_Entirely
Hints the kernel that the entire contents in the file will be
used. This way, read-ahead is definitly a good idea, and if
the file is "small", it can be read entirely and asynchronously on
fopen, even before any data is actually read. This can make the
reads effectively "non-blocking"

Possible usage:
* in a C-compiler: .c and .h files
* opening shell-scripts
* in a webserver for documents that are to be sent
* all _files_ with the sticky-bit

Kernels view:
On a fopen, all the blocks (for a "small" file) are scheduled to
be read from "disk". For large files, only the first N blocks are
scheduled.
On a read, the blocks already are in memory (hopefully). For a large
file, if more than M blocks are read in so far, the next M blocks
are read-ahead pre-emtively

Kernel-level
------------

Zero-copy TCP/IP
It is possible to have a zero-copy TCP/IP stack, at least up to
a certain extend. Don't ask me how;
I know this because a fellow-student has implemented this in a
"embedded TCP/IP" device [aka settop-box].

Device-DMA to file
Sendfile and friends seem like a good idea here. This can be
used for video/audio-capture. Maybe a streamfile() that "connects"
to files and transfers all data between them is an option.

Greetings,
Arjan van de Ven

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/