Re: I/O completion ports for Linux

Michael Nelson (mikenel@wam.umd.edu)
Tue, 31 Mar 1998 23:16:52 -0500 (EST)


When several threads are blocked in read() on the same IOCP, does it
wake the most recently used thread first, like NT does?

On Fri, 27 Mar 1998, Robey Pointer wrote:

> The following is a patch against Linux 2.1.84 to implement "I/O
> completion ports" similar to VMS/NT. It will probably patch against
> more recent kernels, with slight changes. (I used 84 because it was
> the last one I could boot from when I started the patch -- the next
> version of the patch will be against one of the 2.1.9x probably.)
>
> An I/O completion port is a special fd that receives notification of
> completed I/O (hence the name...) from other fds. You can start an
> I/O operation from one thread, and any other thread can pick up the
> result later, when the I/O is finished.
>
> The general concept is: You open an IOCP, and associate file
> descriptors with it via ioctl(). Read/write operations on an
> associated fd will now always return EINPROGRESS, indicating that
> the I/O request has been queued. A read() on the IOCP will eventually
> pull out a data structure containing all the information from
> the original read/write call, plus the results.
>
> Normal unix asynchronous I/O uses signals to notify a process/thread
> that some I/O operation won't block, but no information is given to
> the process beyond that signal. Solaris' aio_read/write (when they
> are implemented) will queue the request but will still use signals
> to notify the process of completion. Linux IOCP avoids signals
> altogether.
>
> I've included a small test program, "iocp-test.c", to show how
> IOCPs work.
>
> This patch is intended as a "proof of concept" only -- it works (for
> me), but isn't ready for prime time yet. Issues that need to be
> solved:
>
> * In this patch, each IOCP can only have 15 pending I/O operations
> on it. In reality, applications would probably want to queue a
> lot more, and the queue should be allocated dynamically. That
> also means there will need to be a mechanism to keep users from
> abusing it by creating a bunch of IOCPs and sucking up lots of
> memory that way.
>
> * read() on an IOCP is implemented, but not poll() yet. (Should
> be easy, I just haven't bothered yet.)
>
> * The method of opening an IOCP is a total hack (creating a bogus
> flag in the open() syscall). A better way would probably be to
> create a device like "/dev/iocp" which hands out IOCPs, but I'm
> not familiar enough with Linux's device code to try this yet.
> (And I'm not entirely sure it can be done without assigning a
> different minor# to each IOCP, which would suck.)
>
> * Only read() and write() syscalls are supported in this patch,
> and only for sockets. Like I said, it's just a proof of concept.
>
> Please send comments and suggestions! This is my first attempt at a
> kernel patch. :)
>
> Robey
> --
> Robey Pointer | "So that's what an invisible barrier
> robey@lag.net | looks like." -Time Bandits
> http://www.lag.net/~robey | (join the 90's retro bandwagon early!)
>
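
For anyone following along, the queueing model described above (a fixed
number of slots, each holding the original request plus its result) can
be sketched in plain user-space C. This is only an illustration under
my own assumptions: it does not use the patch's interface, and every
name in it (iocp_sim_port, iocp_sim_event, iocp_sim_post,
iocp_sim_fetch) is hypothetical. The capacity of 15 mirrors the limit
mentioned in the post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical names throughout; none of this comes from the patch. */
#define IOCP_SIM_MAX_PENDING 15  /* per-port limit quoted in the post */

enum iocp_sim_op { IOCP_SIM_READ, IOCP_SIM_WRITE };

/* One completion record: the original request plus its outcome. */
struct iocp_sim_event {
    int fd;               /* descriptor the I/O was issued on */
    enum iocp_sim_op op;  /* read or write */
    void *buf;            /* caller's buffer */
    size_t len;           /* requested length */
    ssize_t result;       /* bytes transferred, or -1 on failure */
    int error;            /* errno-style code when result == -1 */
};

/* Fixed-size FIFO ring standing in for the port's queue. */
struct iocp_sim_port {
    struct iocp_sim_event ring[IOCP_SIM_MAX_PENDING];
    size_t head;          /* index of the oldest queued record */
    size_t count;         /* number of records currently queued */
};

void iocp_sim_init(struct iocp_sim_port *p)
{
    memset(p, 0, sizeof *p);
}

/* Queue a record. Returns 0 on success, -1 if all slots are in use. */
int iocp_sim_post(struct iocp_sim_port *p, const struct iocp_sim_event *ev)
{
    assert(p != NULL && ev != NULL);
    if (p->count == IOCP_SIM_MAX_PENDING)
        return -1;
    p->ring[(p->head + p->count) % IOCP_SIM_MAX_PENDING] = *ev;
    p->count++;
    return 0;
}

/* Dequeue the oldest record into *out. Returns 1, or 0 if empty. */
int iocp_sim_fetch(struct iocp_sim_port *p, struct iocp_sim_event *out)
{
    assert(p != NULL && out != NULL);
    if (p->count == 0)
        return 0;
    *out = p->ring[p->head];
    p->head = (p->head + 1) % IOCP_SIM_MAX_PENDING;
    p->count--;
    return 1;
}
```

A caller would post one record per finished operation and drain them
with iocp_sim_fetch(). A dynamically sized queue, as suggested in the
list of open issues, would replace the fixed ring with allocated
storage and some per-user accounting.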

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu