[Fwd: Re: Comparing the aio and epoll event frameworks.]

From: John Myers (jgmyers@netscape.com)
Date: Wed May 21 2003 - 12:22:34 EST


Previously bounced due to some internal error on vger.


attached mail follows:


Davide Libenzi wrote:

>
>Hi John, you seem to have lost a few episodes of the epoll saga. You can
>use epoll in both Edge Triggered or Level Triggered ways
>
I was aware of that.

>You can easily do thread pooling also.
>
Using epoll with thread pooling has the problems I described. You can
get multiple threads simultaneously handling the same event. This is
particularly true when using epoll in level-triggered mode:
ep_reinject_items() reinjects items immediately before returning from
sys_epoll_wait(), so any second thread calling epoll_wait() shortly
thereafter is likely to also get a copy of the event. In edge-triggered
mode, the window is much smaller, but it is still there.

One can work around this issue by having user space maintain its own
globally locked data structure containing its idea of the current epoll
state, but this wastes CPU and becomes a likely site for locking
contention. The kernel is already serializing its own access to the
struct eventpoll; user space should be able to exploit that.
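
A minimal sketch of the user-space workaround described above, under
stated assumptions: ClaimTable and deliver_to_all are hypothetical
names, not part of epoll or any library. Each worker that gets an fd
back from epoll_wait() would first try to claim it under one global
lock and drop the event if another thread already owns it. The
simulation stands in for several threads receiving the same event.

```python
import threading


class ClaimTable:
    """Hypothetical user-space record of which fds are being handled."""

    def __init__(self):
        self._lock = threading.Lock()  # the single, global lock
        self._busy = set()             # fds some thread currently owns

    def try_claim(self, fd):
        """Return True if the caller now owns fd, False otherwise."""
        with self._lock:
            if fd in self._busy:
                return False
            self._busy.add(fd)
            return True

    def release(self, fd):
        """Give fd back once its event has been fully handled."""
        with self._lock:
            self._busy.discard(fd)


def deliver_to_all(table, fd, n_threads):
    """Simulate n_threads all receiving the same event for fd.

    Returns how many of them actually went on to handle it.
    """
    handled = []
    barrier = threading.Barrier(n_threads)

    def worker():
        barrier.wait()            # all threads race for the claim together
        if table.try_claim(fd):
            handled.append(fd)    # only the claim winner handles the event

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(handled)
```

Every event from every thread passes through the one lock in
ClaimTable, which is where the wasted CPU and the contention would
show up.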

>Is poll/select a single threading API ?
>
Yes.

> A thread pooling one ?
>
No. You have to have a single thread calling poll/select on any given
set of file descriptors. The resulting events can then be farmed out to
threads using some other synchronization method, but the API can only
reasonably deliver events to that single calling thread.
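
The single-caller pattern above could be sketched roughly as follows.
This is an illustration only: run_dispatcher is a hypothetical helper,
select.select() stands in for poll/select, and a queue is assumed as
the "other synchronization method". Each fd is handed off once and
then dropped from the watched set so the example terminates.

```python
import os
import queue
import select
import threading


def run_dispatcher(read_fds, n_workers=2):
    """One thread calls select(); workers get ready fds through a queue."""
    work = queue.Queue()
    results = queue.Queue()

    def worker():
        while True:
            fd = work.get()
            if fd is None:        # sentinel: no more work
                break
            results.put((fd, os.read(fd, 4096)))

    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for w in workers:
        w.start()

    # Only this thread ever calls select() on the descriptor set.
    pending = set(read_fds)
    while pending:
        ready, _, _ = select.select(list(pending), [], [])
        for fd in ready:
            pending.discard(fd)   # hand each fd off exactly once
            work.put(fd)

    for _ in workers:
        work.put(None)
    for w in workers:
        w.join()

    out = {}
    while not results.empty():
        fd, data = results.get()
        out[fd] = data
    return out
```

For example, with two pipes that already have data written to them,
run_dispatcher([r1, r2]) returns a dict mapping each read end to the
bytes read from it.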

Another difference I hadn't noticed before is that aio's
sys_io_getevents() uses wake-one semantics, whereas epoll's
sys_epoll_wait() appears to use wake-all semantics. Wake-one semantics
are important for thread pool callers in order to avoid thundering herd
performance problems. Aio unfortunately appears to wake up threads in
FIFO order, which results in pessimal use of cache. This should be
changed to LIFO order, so that the thread that went idle most recently,
and is most likely to still have a warm cache, is woken first.
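
To illustrate the wake-one, LIFO policy suggested above, here is a
small user-space sketch. It is not how the kernel implements its wait
queues; LifoWaitQueue and park_in_order are hypothetical names, and
each waiter is assumed to block on its own threading.Event so that
exactly one thread can be woken at a time.

```python
import threading
import time


class LifoWaitQueue:
    """Wake-one queue that wakes the most recently parked waiter."""

    def __init__(self):
        self._lock = threading.Lock()
        self._waiters = []        # used as a stack of (name, Event)

    def wait(self, name):
        """Park the calling thread until wake_one() selects it."""
        ev = threading.Event()
        with self._lock:
            self._waiters.append((name, ev))
        ev.wait()

    def wake_one(self):
        """Wake exactly one waiter, newest first; return its name."""
        with self._lock:
            if not self._waiters:
                return None
            name, ev = self._waiters.pop()   # LIFO: top of the stack
        ev.set()
        return name

    def waiting(self):
        with self._lock:
            return len(self._waiters)


def park_in_order(q, names):
    """Start one waiter per name, parking them in the given order."""
    threads = []
    for count, name in enumerate(names, start=1):
        t = threading.Thread(target=q.wait, args=(name,))
        t.start()
        while q.waiting() < count:   # wait until this thread is parked
            time.sleep(0.001)
        threads.append(t)
    return threads
```

Parking t1, t2, t3 in that order and calling wake_one() three times
wakes t3, then t2, then t1. Replacing pop() with pop(0) would give the
FIFO order complained about above.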







