Re: > 15,000 Simultaneous Connections

Gideon Glass (gid@cobaltnet.com)
Tue, 14 Sep 1999 23:39:01 -0700 (PDT)


> gaurav banga and jeff mogul presented a paper at the june usenix technical
> conference which describes a new kernel API for scalably managing I/O
> events. the link:

> http://www.cs.rice.edu/~gaurav/papers/usenix99.ps

This API is almost identical to the queued signal model with
sigwaitinfo(), except that it cannot give you multiple queues. In other
words, it is a subset of what the signal model gives you.

I believe that the API presented in the Banga/Mogul/Druschel paper
lets you retrieve N events at one time, in order to amortize the cost
of the dequeuing operation. This seems worthwhile given that the
per-event work that an application is often likely to do is just a
single read, write, or close on a socket. I don't know if sigwaitinfo
(or friends) let you dequeue multiple events.
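
For reference, POSIX sigwaitinfo() and sigtimedwait() fill in a single
siginfo_t per call, so draining N queued signals takes N calls. A
minimal sketch of that follows. The helper name drain_queued_signals
and the choice of SIGRTMIN are illustrative assumptions, not part of
any proposal in this thread:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: queue n realtime signals to this process, then
 * drain them.  Returns the number of sigtimedwait() calls that each
 * dequeued a signal, or -1 on error.
 */
int drain_queued_signals(int n)
{
    sigset_t set;
    siginfo_t info;
    struct timespec zero = { 0, 0 };    /* poll, never block */
    union sigval val;
    int i, calls = 0;

    /* Block the signal so it stays queued instead of being delivered. */
    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);
    if (sigprocmask(SIG_BLOCK, &set, NULL) < 0)
        return -1;

    for (i = 0; i < n; i++) {
        val.sival_int = i;              /* payload, like an fd number */
        if (sigqueue(getpid(), SIGRTMIN, val) < 0)
            return -1;
    }

    /* Each call hands back exactly one queued signal. */
    while (sigtimedwait(&set, &info, &zero) == SIGRTMIN)
        calls++;

    return calls;
}
```

Calling drain_queued_signals(3) should make three sigtimedwait() calls
before the queue is empty, i.e. one system call per event.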

To make the discussion more concrete, here's an API along the lines of
the paper mentioned above, except it avoids the limitations associated
with a single implicit per-process event queue:

/*
* cp_alloc: allocate and return a "completion port", which is actually
* just a new type of file descriptor:
*/
int cp_alloc(void);

/* event types, same as usenix99.ps */
#define EVBIT_READ 1
#define EVBIT_WRITE 2
#define EVBIT_EXCEPT 4

/*
* cp_register: given a completion port file descriptor cp_fd,
* a socket s, and a bitmask of EVBIT_* event types,
* insert s into the set of sockets for which cp_fd will report events.
*
* semantics:
* interestmask == 0 causes s to be unregistered with respect to cp_fd.
* any given socket can only be registered on one cp_fd at a time.
*/
int cp_register(int cp_fd, int s, int interestmask);

/* cp_event: per-file-descriptor report of what kind of I/O took place */
struct cp_event {
    int ev_fd;
    int ev_mask;
};

/*
* cp_dequeue_events: retrieve nr_events queued events,
* or fewer if we time out. return value is number of events dequeued.
* cp_fd is a completion port fd allocated with cp_alloc.
*
* semantics:
* dequeue events in FIFO order. multiple events of the same type
* on a given FD are only reported once (at the first occurrence of
* that event type).
*
* I'm not sure it makes sense to let multiple tasks be inside
* cp_dequeue_events on the same cp_fd at the same time, so this might
* be disallowed (return EAGAIN or something).
*/
int cp_dequeue_events(int cp_fd, int nr_events,
                      struct cp_event * events, struct timeval * timeout);
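
To show how an application might drive these calls, here is a rough
sketch. The kernel side does not exist, so the three cp_* functions
below are a toy user-space stand-in built on poll(): a single static
port, level-triggered, with no FIFO ordering. That is an assumption
made purely so the example runs, not the intended implementation.
drain_once is a hypothetical helper name.

```c
#include <poll.h>
#include <sys/time.h>
#include <unistd.h>

#define EVBIT_READ   1
#define EVBIT_WRITE  2
#define EVBIT_EXCEPT 4

struct cp_event { int ev_fd; int ev_mask; };

/* Toy stand-in state: one port, fixed capacity (assumption). */
#define CP_MAX 64
static struct pollfd cp_set[CP_MAX];
static int cp_count;

int cp_alloc(void) { cp_count = 0; return 0; }

int cp_register(int cp_fd, int s, int interestmask)
{
    int i;
    short ev = 0;

    (void)cp_fd;                        /* only one port in this toy */
    if (interestmask & EVBIT_READ)   ev |= POLLIN;
    if (interestmask & EVBIT_WRITE)  ev |= POLLOUT;
    if (interestmask & EVBIT_EXCEPT) ev |= POLLPRI;

    for (i = 0; i < cp_count; i++)
        if (cp_set[i].fd == s)
            break;
    if (interestmask == 0) {            /* mask 0 means unregister */
        if (i < cp_count)
            cp_set[i] = cp_set[--cp_count];
        return 0;
    }
    if (i == cp_count) {
        if (cp_count == CP_MAX)
            return -1;
        cp_count++;
    }
    cp_set[i].fd = s;
    cp_set[i].events = ev;
    return 0;
}

int cp_dequeue_events(int cp_fd, int nr_events,
                      struct cp_event *events, struct timeval *timeout)
{
    int ms = timeout ? (int)(timeout->tv_sec * 1000 +
                             timeout->tv_usec / 1000) : -1;
    int i, n = 0;

    (void)cp_fd;
    if (poll(cp_set, cp_count, ms) <= 0)
        return 0;                       /* timeout; errors folded in */
    for (i = 0; i < cp_count && n < nr_events; i++) {
        int mask = 0;
        if (cp_set[i].revents & (POLLIN | POLLHUP))  mask |= EVBIT_READ;
        if (cp_set[i].revents & POLLOUT)             mask |= EVBIT_WRITE;
        if (cp_set[i].revents & (POLLPRI | POLLERR)) mask |= EVBIT_EXCEPT;
        if (mask) {
            events[n].ev_fd = cp_set[i].fd;
            events[n].ev_mask = mask;
            n++;
        }
    }
    return n;
}

/* Application side: one batched dequeue, then one read() per event.
 * Returns the total number of bytes read. */
long drain_once(int cp_fd)
{
    struct cp_event ev[16];
    struct timeval tv = { 0, 0 };       /* do not block */
    char buf[256];
    long total = 0;
    int i, n = cp_dequeue_events(cp_fd, 16, ev, &tv);

    for (i = 0; i < n; i++) {
        if (ev[i].ev_mask & EVBIT_READ) {
            ssize_t r = read(ev[i].ev_fd, buf, sizeof buf);
            if (r > 0)
                total += r;
        }
    }
    return total;
}
```

The point is the shape of drain_once: a single dequeue call covers up
to 16 ready descriptors, so the cost of entering the kernel for events
is paid once per batch rather than once per socket.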

Advantages over sigwaitinfo():

- can dequeue multiple events in one call

- Queues would be automatically shared -- iff file descriptors are shared.

- Since there is no implicit per-process state (e.g. signal handlers), we
avoid the need for library and application code to fight it out over
who gets what signals.


Comments?

gid
